Cross-reference to Related Application

This application is related to U.S. Patent No. 6,556,978, which is hereby 
incorporated herein by reference. 

Statement Regarding Federally Sponsored Research or Development 

The U.S. Government has a paid-up license in this invention and the right in 
limited circumstances to require the patent owner to license others on reasonable terms as
provided for by the terms of contract nos. F30602-00-2-0534 and F33615-02-C-4032, 
awarded by the Defense Advanced Research Laboratory and the Air Force Research 
Laboratory. 

Background of the Invention 

Field of the Invention

This invention pertains in general to searching and manipulating database 
information in order to identify database elements satisfying specified properties. The 
invention more particularly pertains to manipulating database information to represent 
and solve satisfiability and other types of logic problems such as those involved in
microprocessor verification and testing.

Description of the Related Art

Most computer systems that manipulate large amounts of data store the data 
by using general patterns of which the individual data elements are instances. The last 
few years have seen extraordinary improvements in the effectiveness of general-purpose
algorithms that manipulate these data, as represented by improvements in the 
performance of Boolean satisfiability engines. These techniques have been applied in a 
wide range of application domains such as generative planning, circuit layout, and 
microprocessor verification and testing. 

As the application domains become more sophisticated, the amount of
information manipulated by these systems grows. In many cases, this information is
represented by the set of ground instances of a single universally quantified axiom, but a
single axiom such as

$\forall x y z.\ [a(x, y) \wedge b(y, z) \rightarrow c(x, z)]$

has $d^3$ ground instances if d is the size of the domain from which x, y and z are taken. In
most cases, the prior art has dealt with the difficulty of managing the large number of 
ground instances by increasing computer memory and by finding axiomatizations for 
which ground theories remain manageably sized. In general, memory and these types of 
axiomatizations are both scarce resources and a more natural solution is desired. There
have been some attempts to manipulate quantified axioms directly, but these attempts
have been restricted to axioms of a particular form and structure. What is needed is a 
general approach that is capable of utilizing whatever structure exists in the data 



Case 8585 

describing the application domain in order to minimize the memory and reduce the 
dependency on axiomatizations. 
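By way of illustration only, the growth in ground instances for a three-variable axiom of the kind shown above can be checked by direct enumeration. The following is a minimal sketch; the tuple encoding of literals and the name `ground_instances` are assumptions made for this example and are not part of the described system.

```python
from itertools import product

def ground_instances(domain):
    """Yield each ground instance of  a(x,y) AND b(y,z) -> c(x,z)  over `domain`."""
    for x, y, z in product(domain, repeat=3):
        # The implication is the clause  NOT a(x,y) OR NOT b(y,z) OR c(x,z).
        # Each literal is (predicate, arg1, arg2, is_positive).
        yield (("a", x, y, False), ("b", y, z, False), ("c", x, z, True))

d = 10                                   # illustrative domain size
instances = list(ground_instances(range(d)))
print(len(instances) == d ** 3)          # one instance per choice of (x, y, z)
```

Doubling the domain size multiplies the number of stored clauses by eight, which is the memory pressure the passage describes.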

Brief Summary of the Invention 

The above need is met by using group theory to represent the data describing
the application domain. This representation expresses the structure inherent in the data.
Moreover, group theory techniques are used to solve problems based on the data in a 
manner that uses computational resources efficiently. 

In one embodiment, the data describing the application domain are 
represented as sets of database entries, where each set (c,G) includes a database element
c and a group G of elements g that can act on the database element c to produce a new 
database element (which is typically denoted as g(c)). The present invention uses 
computational techniques to perform database manipulations (such as a query for a 
database entry satisfying certain syntactic properties) on the data in the group theory 
representation, rather than on the data in the non-group theory representation (referred to
herein as the data's "native representation"). 

A database query asks whether the data describing the application domain 
have one or more specified properties, seeks to identify any input data having the 
properties and, in some cases, constructs new database elements. In one embodiment, a 
query specifies a Boolean satisfiability problem and seeks to determine whether any
solutions or partial solutions to the problem exist (and to identify the solutions or partial
solutions), whether any element of the problem is unsatisfiable given known or
hypothesized problem features, or whether any single element of the problem can be used
to derive new features from existing ones. The input query is typically directed toward 
the native representation of the data and is therefore converted into an equivalent query 
on the data in the group theory representation. 

The converted query is executed on the data in the group theory
representation. In one embodiment, query execution produces a collection of zero or
more group elements g that can act on database elements c associated to the group 
elements and satisfy the properties specified by the input query. If necessary or desired, 
the results of the query are converted from the group theory representation into the native 
representation of the data for the application domain.

Brief Description of the Drawings 

FIG. 1 is a high-level block diagram of a computer system for storing and 
manipulating data using group theory according to an embodiment of the present 
invention;

FIG. 2 is a high-level block diagram of program modules for storing and 
manipulating data using group theory according to one embodiment of the present 
invention; 

FIG. 3 is a flowchart illustrating steps for performing database manipulations 
using group theory according to an embodiment of the present invention;




FIG. 4 is a flowchart illustrating the "formulate query" and "execute query" 
steps of FIG. 3 according to an embodiment of the present invention wherein multiple 
low-level queries are generated from an initial query; 

FIG. 5 is a flowchart illustrating a more detailed view of the "formulate 
query" and "execute query" steps of FIG. 3; and 

FIGS. 6-16 illustrate diagrams representing examples of search spaces utilized 
by an embodiment of the present invention. 

The figures depict an embodiment of the present invention for purposes of 
illustration only. One skilled in the art will readily recognize from the following 
description that alternative embodiments of the structures and methods illustrated herein 
may be employed without departing from the principles of the invention described herein. 

Detailed Description of the Preferred Embodiments 

FIG. 1 is a high-level block diagram of a computer system 100 for storing and 
manipulating database information using group theory according to an embodiment of the 
present invention. Illustrated is at least one processor 102 coupled to a bus 104. Also
coupled to the bus 104 are a memory 106, a storage device 108, a keyboard 110, a
graphics adapter 112, a pointing device 114, and a network adapter 116. A display 118 is
coupled to the graphics adapter 112. Some embodiments of the computer system 100
have different and/or additional components than the ones described herein. 




The processor 102 may be any general-purpose processor such as an INTEL 
x86 compatible-, POWERPC compatible-, or SUN MICROSYSTEMS SPARC 
compatible-central processing unit (CPU). The storage device 108 may be any device 
capable of holding large amounts of data, like a hard drive, compact disk read-only 
memory (CD-ROM), DVD, etc. The memory 106 holds instructions and data used by the
processor 102. The pointing device 114 may be a mouse, track ball, light pen,
touch-sensitive display, or other type of pointing device and is used in combination with the
keyboard 110 to input data into the computer system 100. The network adapter 116
optionally couples the computer system 100 to a local or wide area network. 

Program modules 120 for storing and manipulating database information
using group theory according to an embodiment of the present invention are stored on the
storage device 108, from where they are loaded into the memory 106 and executed by the
processor 102. Alternatively, hardware or software modules may be stored elsewhere
within the computer system 100 or on one or more other computer systems connected to
the computer system 100 via a network or other means. As used herein, the term
"module" refers to computer program instructions, embodied in software and/or 
hardware, for performing the function attributed to the module. In one embodiment of 
the present invention, the operation of the computer system 100 is controlled by the 
LINUX operating system, although other operating systems can be used as well. 

An embodiment of the present invention receives data describing objects or
activities external to the computer system 100 and formulates the data in a representation
that expresses structure inherent in the data. In a preferred embodiment, the
representation utilizes a branch of mathematics referred to as "Group Theory" and the
data are represented as sets of database entries, where each set (c,G) includes a database
element c and a group G of elements g that can act on the database element. In one
embodiment, the group elements g are permutations acting on the associated database
element c.

The present invention uses computational techniques to perform database
manipulations (such as a query for a database entry satisfying certain syntactic properties)
on the data in the group theory representation, rather than on the data in the non-group
theory representation (referred to herein as the data's "native representation"). These
database manipulations are more efficient than equivalent manipulations on native or
other representations of the data.

FIG. 2 is a high-level block diagram of program modules 120 for storing and 
manipulating database information using group theory according to one embodiment of
the present invention. Those of skill in the art will recognize that embodiments of the 
present invention can include different modules in addition to, or instead of, the ones 
illustrated herein. Moreover, the modules can provide other functionalities in addition to,
or instead of, the ones described herein. Similarly, the functionalities can be distributed 
among the modules in a manner different than described herein. 

A database module 210 (often referred to as the "database") stores data 
utilized by the present invention. As used herein, the term "database" refers to a 
collection of data and does not imply any particular arrangement of the data beyond that
described herein. The data are within one or more application domains. The data within 
each application domain represent and/or describe one or more objects or activities 




external to the computer system, including prospective and/or theoretical instances of the 
objects and activities, relevant to the domain. For example, data within a given 
application domain can represent a design for a microprocessor or other digital logic 
device. Likewise, data within a given application domain can represent a series of 
actions that must or can be performed on a construction project for building a complex 
object such as a ship or bridge. In another application domain, the data can represent an 
employee directory containing names, telephone numbers, and reporting paths. Those of 
skill in the art will recognize that the data can belong to a wide variety of application 
domains beyond those described here. 

The data within an application domain have inherent structure due to the 
nature of the domain. This structure can be explicit and/or implicit. The data within each 
application domain are preferably represented using group theory in order to express this 
structure. The group theory representation is contrasted with the data's non-group theory, 
or native, representation where the structure is not necessarily expressed. For example, if 
the application domain describes an employee directory, the data in the native 
representation include a listing of names and reporting paths or perhaps a machine- 
readable (but not group theory-based) equivalent of the listing and paths. The data in the 
group theory representation, on the other hand, are represented as sets of database 
elements c and groups G of group elements g, i.e. (c,G) . Each element g of the group G 
"acts" on its associated database element c (the action is typically denoted as g(c)) to 
produce a new database element. 
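By way of non-limiting illustration, the action g(c) can be sketched for the case in which a database element c is a clause of signed literals and a group element g is a permutation of the variables. The integer encoding of literals, the dictionary encoding of permutations, and the function name `act` are assumptions chosen for this sketch; the description does not prescribe a particular encoding.

```python
def act(g, clause):
    """Apply permutation g (a dict on variable indices) to a clause of signed literals."""
    def move(lit):
        var = abs(lit)
        image = g.get(var, var)              # variables not listed in g are left fixed
        return image if lit > 0 else -image  # the sign of the literal is preserved
    return frozenset(move(lit) for lit in clause)

c = frozenset({1, -2})                       # the clause  x1 OR NOT x2
g = {1: 2, 2: 3, 3: 1}                       # the 3-cycle (1 2 3) on variables
print(sorted(act(g, c)))                     # [-3, 2], i.e.  x2 OR NOT x3
```

Under these assumptions a pair (c,G) stores one clause together with the permutations that produce its siblings, rather than storing every sibling explicitly.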

An input/output (I/O) module 212 controls the flow of data into, and out of,
the computer system 100. In general, the I/O module 212 receives "input data," which 
are data within an application domain and describing one or more objects or activities 
external to the computer system, including prospective and/or theoretical instances of the 
objects and/or activities, and "input queries," which are queries asking whether the input
data have certain properties. The input data received by the I/O module 212 can be 
encoded in either a group theory representation or the data's native representation. 

The input data and/or input queries are provided to the computer system 100 
by, for example, transmission over a computer network, loading from a
computer-readable medium, and/or via the keyboard 110, pointing device 114, or other interface
device. The input data are preferably stored in the database 210. The input queries are 
stored in the database 210 or elsewhere within the computer system 100. 

In general, the input data are generated by creating electronic representations 
of objects and/or activities in the pertinent application domains, such as representations
of the aforementioned digital logic device, construction project, or employee directory,
and providing the electronic representations to the computer system 100. In one 
embodiment, the input data are generated by taking measurements of physical or 
prospective physical objects. The input data can be provided to the computer system 100 
in their native representation or after the data are converted from their native
representation to the group theory representation.

An input query asks whether the input data have given properties and seeks to 
identify the input data having the properties. For example, a query can ask whether a
digital logic device behaves in a specified manner, whether an optimal set of steps for a
construction project exists, etc. In one embodiment, a query specifies a Boolean
satisfiability problem and seeks to determine whether any solutions or partial solutions to
the problem exist (and to identify the solutions or partial solutions), whether any element
of the problem is unsatisfiable given known or hypothesized problem features, or whether
any single element of the problem can be used to derive new features from existing ones. 
A query can also cause new database elements to be added in response to results of a 
query. For example, a query can save its results in the database 210 in a group theory 
representation for use by subsequent queries. 

The data flowing out of the computer system 100 are referred to herein as
"output data" and represent a result of the operation of an input query on the input data as
achieved through one or more database manipulations using group theory. The output 
data represent a concrete, tangible, and useful result of the database manipulations and 
can include data resulting from physical transformations of the input data. For example,
the output data can represent a result of a verification and test procedure indicating
whether a digital logic device contains any logic errors. Similarly, the output data can 
represent a result of a process that determines an optimal ordered sequence of steps for a 
construction project or other multi-step task. A computer system or human being can use 
the output data to redesign the digital logic device to correct any errors, to perform the
steps of the construction project, etc. The computer system 100 outputs the output data
by, for example, transmitting the data over a computer network, loading the data onto a 
computer-readable medium, displaying the data on a monitor, etc. 

A database construction module 214 receives input data within an application
domain from the I/O module 212, converts the data from their native representation into 
the group theory representation (if necessary), and stores the data in the group theory 
representation in the database 210. In one embodiment, the database construction module 
214 converts the data from their native representation to the group theory representation 
by encoding the input data as one or more augmented clauses, where each clause has a 
pair (c,G) including a database element c and a group G of elements g acting on it, so 
that the augmented clause is equivalent to the conjunction of the results of operating on c 
with the elements of G. This encoding expresses the structure inherent in the input data. 
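As a hedged illustration of the statement that an augmented clause (c,G) is equivalent to the conjunction of the clauses g(c), the following sketch expands one augmented clause into its ground clauses by computing the orbit of c under a set of generators of G. The encoding (signed integers for literals, dictionaries for permutations) and the names `act` and `expand` are assumptions for this example only.

```python
def act(g, clause):
    """Apply a variable permutation g (dict) to a clause of signed integer literals."""
    def move(lit):
        image = g.get(abs(lit), abs(lit))
        return image if lit > 0 else -image
    return frozenset(move(lit) for lit in clause)

def expand(clause, generators):
    """Return {g(c) : g in G}, where G is the group generated by `generators`."""
    seen = {clause}
    frontier = [clause]
    while frontier:
        current = frontier.pop()
        for g in generators:
            image = act(g, current)
            if image not in seen:            # a clause not reached before
                seen.add(image)
                frontier.append(image)
    return seen

# "x1 and x2 are not both true", with G the full symmetric group on x1..x4,
# generated here by a transposition and a 4-cycle.
c = frozenset({-1, -2})
swap = {1: 2, 2: 1}
cycle = {1: 2, 2: 3, 3: 4, 4: 1}
ground = expand(c, [swap, cycle])
print(len(ground))                           # 6: one clause per pair of variables
```

The single pair (c, {swap, cycle}) therefore stands for six ordinary clauses in this toy case; the expansion is shown only to make the equivalence concrete, since the point of the representation is to avoid performing it.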

A query formulation module 216 receives the input query from the I/O 
module 212. The input query is typically directed toward the native representation of the 
input data. The query formulation module 216 therefore converts the input query into 
one or more equivalent queries on the input data in the group theory representation. In 
general, a query specifies a search for database elements satisfying a property P. The 
query formulation module 216 converts this query into a search for a pair (g,c) where g is
a group element and c is a database element, such that g(c) satisfies property P. 

In one embodiment, the input query is a "high-level" query. A high-level 
query is a general question about the input data. Examples of high-level queries are "are 
the data consistent?" and "does the logic device described by the data have errors?" The 
answer to a high-level query is not generally a database element, but rather is information 
derived from the database elements. For example, the answer can be a description of an 
error in a digital logic device or a report that the device does not have any errors. 




In one embodiment, the query formulation module 216 converts the high-level 
query directly into a group theory based representation. In another embodiment, the 
query formulation module 216 converts the high-level query into multiple "low-level" 
queries. A low-level query corresponds to a search for a database element satisfying a 
particular property, such as "is there a single element of the data in the application 
domain that is unsatisfiable given the assignments made so far?" The query formulation 
module 216 converts each of these low-level queries into a group theory representation. 

In another embodiment, the input query received by the query formulation 
module 216 is a low-level query. The low-level query can be derived from a high-level 
query by an entity external to the module 216. Alternatively, the low-level query can be 
the entire query desired to be executed on the input data. In either case, the query 
formulation module 216 converts the low-level query into one or more queries in the 
group theory representation. 

A query execution module 218 receives the one or more converted queries 
from the query formulation module 216 and executes the queries on the data in the group 
theory representation of the input data in the database 210. The query execution module 
218 preferably uses techniques drawn from computational group theory in order to 
execute the queries efficiently. The result of a query is a set of zero or more pairs (g,c) 
where g is a group element and c is a database element, where each g(c) satisfies the 
property P specified by the query. In one embodiment, the set of answers to the query 
form a subgroup. In this case, the result of the query can be represented using a compact 
representation of the subgroup where a small number of group elements h_1, ..., h_n are
described explicitly and the other elements of the subgroup can then be computed by
combining h_1, ..., h_n.
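The compact representation just described can be illustrated with a short sketch in which a subgroup is stored as a few explicit generators and the remaining elements are recovered by repeated composition. Permutations are encoded here as tuples on the points 0..n-1, and `compose` and `generate` are hypothetical helper names; a practical system would rely on computational group theory techniques rather than enumerating the subgroup.

```python
def compose(p, q):
    """Return the permutation 'apply q, then p' for tuples on points 0..n-1."""
    return tuple(p[q[i]] for i in range(len(p)))

def generate(generators):
    """Enumerate the finite subgroup generated by `generators`."""
    identity = tuple(range(len(generators[0])))
    elements = {identity}
    frontier = [identity]
    while frontier:
        current = frontier.pop()
        for h in generators:
            candidate = compose(h, current)
            if candidate not in elements:
                elements.add(candidate)
                frontier.append(candidate)
    return elements

h1 = (1, 0, 2, 3)                 # swaps points 0 and 1
h2 = (1, 2, 3, 0)                 # the 4-cycle 0 -> 1 -> 2 -> 3 -> 0
print(len(generate([h1, h2])))    # 24: two generators describe all of Sym(4)
```

Because the group is finite, closing the identity under multiplication by the generators reaches every element, so inverses need not be stored separately.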

A result construction module 220 receives the results of the one or more query 
executions from the query execution module 218 and converts the results from the group 
theory representation to the native representation of the data. Depending upon the 
application domain, the result produced by the result construction module 220 can be a 
result of a test on a digital logic device, a sequence of steps for a construction project, etc. 
The native representation of the query results is received by the I/O module 212, which 
outputs it as the output data. 

In the simple case where a query executed by the query execution module 218 
produces a set of pairs (g,c) where g is a group element and c is a database element, such 
that each g(c) satisfies property P, the result construction module 220 generates g(c) to 
reconstruct the result of the original query in terms familiar to the native representation of 
the input data. In a more complex case where the query execution module 218 executes 
multiple queries and generates multiple results in response to an input query, one 
embodiment of the result construction module 220 combines the results into a collective 
result answering the input query. For example, a high-level query can lead to multiple 
low-level queries and corresponding results. In this case, an embodiment of the result 
construction module 220 analyzes the results of the low-level queries to generate an 
answer to the high-level query in terms familiar to the native representation of the input 
data. In one embodiment, the result construction module 220 constructs answers to both 
high and low-level queries in terms familiar to the native representation of the input data.
This latter embodiment can be used, for example, when the results of the low-level
queries allow one to understand the answer to the high-level query. 

FIG. 3 is a flowchart illustrating steps for performing database manipulations 
using group theory according to an embodiment of the present invention. Those of skill in
the art will recognize that embodiments of the present invention can have additional
and/or other steps than those described herein. In addition, the steps can be performed in
different orders. Moreover, depending upon the embodiment, the functionalities 
described herein as being within certain steps can be performed within other steps. In 
one embodiment, the steps described herein are performed by the modules illustrated in 
FIG. 2. However, in alternative embodiments, some or all of the steps are performed by
other entities. 

Initially, the input data and input query are received 310. For example, the
input data can describe a digital logic device and the input query can ask whether the
device behaves in a specified manner. The input data need not be received
contemporaneously with the input query, and the input data and input query are not
necessarily received in a particular order. The input data either are already in the group
theory representation, or are converted 312 from their native representation into the group
theory representation. In addition, the input query is formulated 314 in terms of the
group theory representation. The group theory representation of the query is executed
316 on the group theory representation of the input data to produce a result. The result is
then output 318.

FIG. 4 is a flowchart illustrating a view of the "formulate query" 314 and
"execute query" 316 steps of FIG. 3 according to an embodiment where a high-level 
query is reduced into multiple low-level queries. These steps are identified by a broken 
line 320 in FIG. 3. Initially, the low-level queries in the group theory representation are 
generated 410 from the high-level query (or from another low-level query as described
below). A low-level query is executed 412 on the group theory representation of the 
input data to produce a result. This result is returned 414 to the entity performing the 
method and can be used, for example, to create new database elements and/or modify 
existing elements. If 416 there are more low-level queries, the method returns to step 412 
and continues to execute. In one embodiment, the results of one or more of the low-level
queries are used 418 to generate 410 additional low-level queries and the method 
therefore returns to step 410. Once the low-level queries have executed, the method uses 
the results of the low-level queries to generate 420 a result of the high-level query. This 
result is then output 318. 
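The control flow of FIG. 4 can be sketched, by way of example only, as a worklist loop. The callables `generate`, `execute` and `combine` are hypothetical stand-ins for steps 410, 412 and 420; their signatures are assumptions made for this sketch, and the toy query shown (whether any clause is falsified by a partial assignment) is merely one possible high-level query.

```python
from collections import deque

def answer_high_level(high_level_query, data, generate, execute, combine):
    """Reduce a high-level query to low-level queries and combine their results."""
    pending = deque(generate(high_level_query, None))   # step 410
    results = []
    while pending:                                       # step 416: more queries?
        query = pending.popleft()
        result = execute(query, data)                    # step 412
        results.append(result)                           # step 414
        pending.extend(generate(query, result))          # step 418, back to 410
    return combine(results)                              # step 420

# Toy usage: one low-level query per clause, asking whether it is falsified.
assignment = {1: True, 2: False}
clauses = [frozenset({1, 2}), frozenset({-1, 2})]

def generate(query, result):
    # Initial call (result is None) yields one query per clause; no follow-ups.
    return list(range(len(clauses))) if result is None else []

def execute(index, data):
    # A clause is falsified when every literal is assigned and false.
    return all(assignment.get(abs(lit)) == (lit < 0) for lit in data[index])

answer = answer_high_level("is any clause falsified?", clauses, generate, execute, any)
print(answer)                                            # True: NOT x1 OR x2 is falsified
```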

FIG. 5 is a flowchart illustrating a detailed view of the "formulate query" 314
and "execute query" 316 steps of FIG. 3, which can correspond to the "execute low-level
query" 412 step of FIG. 4 in certain embodiments. The native representation of the input
query (or of the low-level queries generated from a high-level query) is generally of the type
"find element x that satisfies property P." This query is converted 510 into the equivalent
group theory query, which is "find database element c and element g of a group G, such
that g(c) satisfies property P." Next, the formulated query is executed 512 on the input 
data in the group theory representation to identify any database elements c and group 
elements g where g(c) satisfies P. The zero or more pairs (g,c) that are identified in
response to the query are converted from the group theory representation into the native
representation of the input data. This conversion is performed by using the group 
element g, and the database element c, to construct 514 g(c). The resulting value 516 is 
returned as the result of the query. 
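A minimal sketch of steps 512 and 514 follows, assuming that each database entry lists its group elements explicitly and that property P is supplied as a Python predicate. The names `act`, `execute_query` and `construct_results` are hypothetical, and the exhaustive loop over group elements is used only for clarity; it is not the efficient search the description attributes to computational group theory.

```python
def act(g, clause):
    """Apply a variable permutation g (dict) to a clause of signed integer literals."""
    def move(lit):
        image = g.get(abs(lit), abs(lit))
        return image if lit > 0 else -image
    return frozenset(move(lit) for lit in clause)

def execute_query(database, prop):
    """Step 512: return the pairs (g, c) for which g(c) satisfies `prop`."""
    return [(g, c) for c, group in database for g in group if prop(act(g, c))]

def construct_results(pairs):
    """Step 514: convert each pair (g, c) into the native element g(c)."""
    return [act(g, c) for g, c in pairs]

# One entry: the clause x1 OR x2 with the cyclic group of order 3 on x1, x2, x3.
c = frozenset({1, 2})
group = [{}, {1: 2, 2: 3, 3: 1}, {1: 3, 3: 2, 2: 1}]
database = [(c, group)]

# Property P: "every literal is false under the current partial assignment".
assignment = {2: False, 3: False}
def falsified(clause):
    return all(assignment.get(abs(lit)) == (lit < 0) for lit in clause)

pairs = execute_query(database, falsified)
print([sorted(r) for r in construct_results(pairs)])   # [[2, 3]]
```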

The following description includes two sections describing an embodiment of
the present invention. The first section describes the theory of the present invention. The
second section describes an implementation of one embodiment. Those of skill in the art 
will recognize that embodiments of the present invention can differ from the ones 
described herein.

THEORY
1 Introduction 

This document describes ZAP, a satisfiability engine that substantially generalizes existing 
tools while retaining the performance characteristics of existing high-performance solvers 
such as zChaff. Those of skill in the art will recognize that the concepts described herein 
can be utilized for problem domains other than satisfiability. The following table describes 
a variety of existing computational improvements to the Davis-Putnam-Logemann-Loveland 
(DPLL) inference procedure:

[Table: rows SAT, cardinality, pseudo-Boolean, symmetry, QPROP; columns representational efficiency, p-simulation hierarchy, inference, propagation, learning. Surviving entries: exp; ???; in P using reasons; + first-order.]



The lines of the table correspond to observations regarding existing representations used 
in satisfiability research, as reflected in the labels in the first column: 



1. SAT refers to conventional Boolean satisfiability work, representing information as 
conjunctions of disjunctions of literals (CNF).

2. cardinality refers to the use of "counting" constraints; if we think of a conventional
disjunction of literals $\bigvee_i l_i$ as

$\sum_i l_i \geq 1$

then a cardinality constraint is one of the form

$\sum_i l_i \geq k$

for a positive integer k.

3. pseudo-Boolean constraints extend cardinality constraints by allowing the literals in
question to be weighted:

$\sum_i w_i l_i \geq k$

Each $w_i$ is a positive integer giving the weight to be assigned to the associated literal.

4. symmetry involves the introduction of techniques that are designed to explicitly
exploit local or global symmetries in the problem being solved.

5. QPROP deals with universally quantified formulae where all of the quantifications are
over finite domains of known size. 
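To make the relationship between a cardinality constraint and conventional CNF concrete, the following sketch expands "at least k of n literals are true" into the equivalent set of disjunctions and checks the equivalence by brute force. The string names for literals and the helper names `cardinality_to_cnf` and `holds` are assumptions for this illustration.

```python
from itertools import combinations, product
from math import comb

def cardinality_to_cnf(literals, k):
    """Clauses equivalent to 'at least k of `literals` are true'."""
    size = len(literals) - k + 1      # any n-k+1 of the literals must include a true one
    return [frozenset(group) for group in combinations(literals, size)]

def holds(clauses, true_literals):
    """True when every clause contains at least one true literal."""
    return all(clause & true_literals for clause in clauses)

literals = ["l1", "l2", "l3", "l4", "l5"]
k = 3
cnf = cardinality_to_cnf(literals, k)
print(len(cnf) == comb(len(literals), k - 1))   # one constraint replaces C(n, k-1) clauses

# Exhaustive check of the equivalence over all 2**5 assignments.
for values in product([False, True], repeat=len(literals)):
    true_literals = {lit for lit, v in zip(literals, values) if v}
    assert holds(cnf, true_literals) == (len(true_literals) >= k)
```

A single cardinality constraint over n literals thus corresponds to C(n, k-1) ordinary clauses, which is the kind of saving the representational efficiency column is meant to capture.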

The remaining columns in the table measure the performance of the various systems 
against a variety of metrics:

1. Representational efficiency measures the extent to which a single axiom in a
proposed framework can replace many in CNF. For cardinality, pseudo-Boolean and
quantified languages, it is possible that exponential savings are achieved. We argue that
such savings are possible but relatively unlikely for cardinality and pseudo-Boolean
encodings but are relatively likely for QPROP.

2. p-simulation hierarchy gives the minimum proof length for the representation on
three classes of problems: the pigeonhole problem, parity problems and clique coloring
problems. An E indicates exponential proof length; P indicates polynomial length.
While symmetry-exploitation techniques can provide polynomial-length proofs in
certain instances, the method is so brittle against changes in the axiomatization that we
do not regard this as a polynomial approach in general.

3. Inference indicates the extent to which resolution can be lifted to a broader setting.
This is straightforward in the pseudo-Boolean case; cardinality constraints have the
problem that the most natural resolvent of two cardinality constraints may not be
one. Systems that exploit local symmetries must search for such symmetries at each
inference step, a problem that is not believed to be in P. Provided that reasons are
maintained, inference remains well defined for quantified axioms, requiring only the
introduction of a (linear complexity) unification step.

4. Propagation describes the techniques available to draw conclusions from an existing
partial assignment of values to variables. For all of the systems except QPROP, Zhang
and Stickel's watched literals idea is the most efficient mechanism known. This
approach cannot be lifted to QPROP, but a somewhat simpler method can be lifted and
average-case exponential savings obtained as a result.

5. Learning reflects the techniques available to save conclusions as the inference proceeds.
In general, relevance-bounded learning is the most effective technique known here. It
can be augmented with strengthening in the pseudo-Boolean case and with first-order
reasoning if quantified formulae are present.
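The propagation metric above concerns drawing conclusions from a partial assignment. The following is a deliberately naive sketch of unit propagation on ordinary clauses; it rescans every clause on each pass and is not the watched literals scheme mentioned in the text. The function name `unit_propagate` and the signed-integer encoding of literals are assumptions for this example.

```python
def unit_propagate(clauses, assignment):
    """Extend `assignment` (dict var -> bool) with forced values; return None on conflict."""
    assignment = dict(assignment)                 # leave the caller's dict untouched
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
                continue                          # clause already satisfied
            unassigned = [lit for lit in clause if abs(lit) not in assignment]
            if not unassigned:
                return None                       # every literal is false: conflict
            if len(unassigned) == 1:
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0    # unit clause forces this literal
                changed = True
    return assignment

implications = [[-1, 2], [-2, 3]]                 # x1 -> x2 and x2 -> x3
print(unit_propagate(implications, {1: True}))    # {1: True, 2: True, 3: True}
print(unit_propagate([[-1, 2], [-1, -2]], {1: True}))   # None
```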

An embodiment of the present invention, referred to herein as "ZAP," is represented in
the table as:

Some comments are in order:

• Unlike cardinality and pseudo-Boolean methods, which seem unlikely to achieve
exponential reductions in problem size in practice, and QPROP, which seems likely to
achieve such reductions, ZAP is guaranteed to replace a set of n axioms for which the
requisite structure is present with a single axiom of size log(n) (Proposition 3.6).

• In addition to providing robust, polynomially sized proofs of the three "benchmark"
proof complexity problems cited previously, ZAP can p-simulate extended resolution,
so that if P is a proof in extended resolution, there is a proof P' of size polynomial in
P in ZAP (Theorem 5.12).

• The fundamental inference step in ZAP is in NP with respect to the ZAP
representation, and therefore has worst-case complexity exponential in the representation size
(i.e., polynomial in the number of Boolean axioms being resolved). The average-case
complexity appears to be low-order polynomial in the size of the ZAP representation
(i.e., polynomial in the logarithm of the number of Boolean axioms being resolved).

• Zap obtains the savings attributable to subsearch in the qprop case while casting 
them in a general setting that is equivalent to watched literals in the Boolean case. 
This particular observation is dependent on a variety of results from computational 
group theory. 

• In addition to learning the Boolean consequences of resolution, zap continues to sup- 
port relevance-based learning schemes while also allowing the derivation of first-order 
consequences, conclusions based on parity arguments, and combinations thereof. 

The next section summarizes both the dpll algorithm and the modifications that embody 
recent progress, casting dpll into the precise form that is both needed in zap and that seems 
to best capture the architecture of modern systems such as zChaff. 

In a later section, we present the insights underlying ZAP. Beginning with a handful 
of examples, we see that the structure exploited in earlier examples corresponds to the 
existence of particular subgroups of the group of permutations of the literals in the problem; 
this corresponds to the representational efficiency column of our table. The "inference" 
section describes resolution in this broader setting, and the p-simulation hierarchy section
presents a variety of examples of these ideas at work, showing that the pigeonhole problem, 
clique-coloring problems, and Tseitin's parity examples all admit short proofs in the new 
framework. We also show that our methods p-simulate extended resolution. The learning 
section recasts the dpll algorithm in the new terms and discusses the continued applicability 
of relevance in our setting. 

2 Boolean satisfiability engines 

We begin here by being precise about Davis-Putnam-Logemann-Loveland extensions that
deal with learning. We give a description of the dpll algorithm in a
learning/reason-maintenance setting, and prove that it is possible to implement these ideas
while retaining the soundness and completeness of the algorithm but using an amount of
memory that grows polynomially with problem size.

Procedure 2.1 (Davis-Putnam-Logemann-Loveland) Given a SAT problem C and a
partial assignment P of values to variables, to compute dpll(C, P):

1 P <- Unit-Propagate(P)
2 if P = FAILURE
3    then return failure
4 if P is a solution to C
5    then return success
6 l <- a literal not assigned a value by P
7 if dpll(C, P ∪ {l = true}) = success
8    then return success
9    else return dpll(C, P ∪ {l = false})

Variables are assigned values via branching and unit propagation. In unit propagation 
(described below), the existing partial assignment is propagated to new variables where pos- 
sible. If unit propagation terminates without reaching a contradiction or finding a solution, 
then a branch variable is selected and assigned a value, and the procedure recurs. Here is a 
formal description of unit propagation: 

Procedure 2.2 (Unit propagation) To compute Unit-Propagate(P):

1 while there is a currently unsatisfied clause c ∈ C that contains at most one literal
  unassigned a value by P
2    do if every variable in c is assigned a value by P
3          then return FAILURE
4          else v <- the variable in c unassigned by P
5               P <- P ∪ {v = V : V is selected so that c is satisfied}
6 return P
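Procedures 2.1 and 2.2 can be sketched in a few lines of Python. This is an illustrative sketch only, not the zap implementation: it assumes clauses are lists of nonzero integers (a negative integer denotes a negated variable), represents the partial assignment as a dictionary, and branches on the lowest-numbered free variable. The names `unit_propagate` and `dpll` are chosen here for illustration.

```python
def unit_propagate(clauses, assignment):
    """Procedure 2.2: extend the assignment (dict var -> bool), or return None on FAILURE."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue  # clause is already satisfied
            unassigned = [l for l in clause if abs(l) not in assignment]
            if not unassigned:
                return None  # every literal is false: FAILURE
            if len(unassigned) == 1:
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0  # the value that satisfies the clause
                changed = True
    return assignment


def dpll(clauses, assignment=None):
    """Procedure 2.1: return a satisfying assignment, or None if there is none."""
    assignment = unit_propagate(clauses, assignment or {})
    if assignment is None:
        return None
    variables = {abs(l) for c in clauses for l in c}
    free = sorted(variables - set(assignment))
    if not free:
        return assignment  # all variables valued and no clause falsified
    v = free[0]  # assumed branching rule: lowest-numbered free variable
    for value in (True, False):
        result = dpll(clauses, {**assignment, v: value})
        if result is not None:
            return result
    return None
```

The sketch rescans the whole clause list on every pass, which is exactly the cost that watched literals and the subsearch techniques discussed later are meant to reduce.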

We can rewrite this somewhat more formally using the following definition: 

Definition 2.3 Let $\bigvee_i l_i$ be a clause, which we will denote by $c$. Now suppose that $P$ is a
partial assignment of values to variables. We will say that the possible value of $c$ under $P$ is
given by

$$\mathrm{poss}(c, P) = |\{ i : \neg l_i \notin P \}| - 1$$

If no ambiguity is possible, we will write simply $\mathrm{poss}(c)$ instead of $\mathrm{poss}(c, P)$. In other
words, $\mathrm{poss}(c)$ is the number of literals that are either already satisfied or not valued by $P$,
reduced by one (since the clause requires that one literal be true).
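As a small illustration of Definition 2.3, the following sketch computes poss for a clause. It assumes (for illustration only) that literals are nonzero integers, with negation represented by sign, and that the partial assignment is the set of literals currently set to true.

```python
def poss(clause, assignment):
    """Number of literals in `clause` not falsified by `assignment`, minus one.

    clause: iterable of nonzero ints (negative = negated variable).
    assignment: set of literals currently set to true.
    """
    return sum(1 for lit in clause if -lit not in assignment) - 1
```

Under this convention a value of 0 marks a clause with a single remaining candidate literal, and a value of -1 marks a clause whose literals are all false.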

If $S$ is a set of clauses, we will write $\mathrm{poss}_n(S, P)$ for the subset of $c \in S$ for which
$\mathrm{poss}(c, P) \le n$ and $\mathrm{poss}_{>n}(S, P)$ for the subset of learned clauses $c \in S$ for which
$\mathrm{poss}(c, P) > n$.

The above definition can be extended easily to deal with pseudo-Boolean instead of Boolean
constraints, although that extension will not be our focus here.

Procedure 2.4 (Unit propagation) To compute Unit-Propagate(P):

1 while poss_0(C, P) ≠ ∅
2    do if poss_{-1}(C, P) ≠ ∅
3          then return failure
4          else c <- an element of poss_0(C, P)
5               v <- the variable in c unassigned by P
6               P <- P ∪ {v = V : V is selected so that c is satisfied}
7 return P

As with Definition 2.3, the above description can be extended easily to deal with pseudo- 
Boolean or other clause types. 

Procedures 2.4 and 2.1 are generally extended to include some sort of learning. For our 
ideas to be consistent with this, we introduce: 

Definition 2.5 A partial assignment is an ordered sequence

$$\langle l_1, \ldots, l_n \rangle$$

of literals. An annotated partial assignment is an ordered sequence

$$\langle (l_1, c_1), \ldots, (l_n, c_n) \rangle$$

of literals and clauses, where $c_i$ is the reason for literal $l_i$ and either $c_i = \mathit{true}$ (indicating
that $l_i$ was a branch point) or $c_i$ is a clause such that:

1. $l_i$ is a literal in $c_i$, and

2. $\mathrm{poss}(c_i, \langle l_1, \ldots, l_{i-1} \rangle) = 0$

An annotated partial assignment will be called sound with respect to a set of constraints $C$
if $C \models c_i$ for each reason $c_i$.

The point of the definition is that the reasons have the property that after the literals
$l_1, \ldots, l_{i-1}$ are all set to true, it is possible to conclude $l_i$ from $c_i$ by unit propagation.

Definition 2.6 If $c_1$ and $c_2$ are reasons, we will define the result of resolving $c_1$ and $c_2$ to
be:

$$\mathrm{resolve}(c_1, c_2) = \begin{cases} c_2, & \text{if } c_1 = \mathit{true}; \\ c_1, & \text{if } c_2 = \mathit{true}; \\ \text{the conventional resolvent of } c_1 \text{ and } c_2, & \text{otherwise.} \end{cases}$$

We can now rewrite the unit propagation procedure as follows:

Procedure 2.7 (Unit propagation) To compute Unit-Propagate(P) for an annotated
partial assignment P:

1 while poss_0(C, P) ≠ ∅
2    do if poss_{-1}(C, P) ≠ ∅
3          then c <- an element of poss_{-1}(C, P)
4               l_i <- the literal in c with the highest index in P
5               return (true, resolve(c, c_i))
6          else c <- an element of poss_0(C, P)
7               l <- the literal in c unassigned by P
8               P <- P ∪ (l, c)
9 return (false, P)

The above procedure returns a pair of values. The first indicates whether a contradiction has
been found. If so, the second value is the reason for the failure, an unsatisfiable consequence
of the clausal database C. If no contradiction is found, the second value is a suitably modified
partial assignment. Procedure 2.7 has also been modified to work with annotated partial
assignments, and to annotate the new choices that are made when P is extended.

Proposition 2.8 Suppose that $C$ is a Boolean satisfiability problem, and $P$ is a sound
annotated partial assignment. Then:

1. If unit-propagate($P$) = (false, $P'$), then $P'$ is a sound extension of $P$, and

2. If unit-propagate($P$) = (true, $c$), then $C \models c$ and $c$ is falsified by the assignments
in $P$.

When the unit propagation procedure "fails" and returns (true, $c$) for a new nogood $c$,
there are a variety of choices that must be made by the overall search algorithm. The new
clause can be added to the existing collection of axioms to be solved, perhaps simultaneously
deleting other clauses in order to ensure that the clausal database remains manageably sized.
It is also necessary to backtrack at least to a point where $c$ becomes satisfiable. Some systems
such as zChaff backtrack further to the point where $c$ is unit. Here is a suitable modification
of Procedure 2.1:

Procedure 2.9 Given a SAT problem C and an annotated partial assignment P, to compute
dpll(C, P):

1  if P is a solution to C
2     then return P
3  (x, y) <- Unit-Propagate(P)
4  if x = true
5     then c <- y
6          if c is empty
7             then return failure
8             else C <- C ∪ {c}
9                  delete clauses from C as necessary to keep C small
10                 backtrack at least to the point that c is satisfiable
11                 return dpll(C, P)
12    else P <- y
13         if P is a solution to C
14            then return P
15            else l <- a literal not assigned a value by P
16                 return dpll(C, (P, (l, true)))



This procedure is substantially modified from Procedure 2.1, so let us go through it. 

The fundamental difference is that both unit propagation and the dpll procedure can 
only fail if a contradiction (an empty clause c) is derived. In all other cases, progress is made 
by augmenting the set of constraints to include at least one new constraint that eliminates 
the current partial solution. Instead of simply resetting the branch literal $l$ to take the
opposite value as in the original Procedure 2.1, a new clause is learned and added to the
problem, which will cause either $l$ or some previous variable to take a new value.

The above description is ambiguous about a variety of points. We do not specify how the 
branch literal is chosen, the precise point to backtrack to, or the scheme by which clauses are 
removed from C. The first two of these are of little concern to us; zap makes the same choices 
here that zChaff does and implementing these choices is straightforward. Selecting clauses 
for removal involves a search through the database for clauses that have become irrelevant or 
meet some other condition, and the computational implications of this "subsearch" problem 
need to be considered as the algorithm evolves. 

It will be useful for us to make this explicit, so let us suppose that we have some relevance 
bound k. The above procedure now becomes: 

Procedure 2.10 (Relevance-bound reasoning, rbl) Given a SAT problem C and an
annotated partial assignment P, to compute rbl(C, P):

1  if P is a solution to C
2     then return P
3  (x, y) <- Unit-Propagate(P)
4  if x = true
5     then c <- y
6          if c is empty
7             then return failure
8             else remove successive elements from P so that c is satisfiable
9                  C <- C ∪ {c} − poss_{>k}(C, P)
10                 return rbl(C, P)
11    else P <- y
12         if P is a solution to C
13            then return P
14            else l <- a literal not assigned a value by P
15                 return rbl(C, (P, (l, true)))



Note that the clauses deleted because they belong to $\mathrm{poss}_{>k}(C, P)$ are only learned
clauses that have become irrelevant (see the definition of $\mathrm{poss}_{>n}$ following Definition 2.3);
it is inappropriate to remove clauses that are part of the original problem specification.
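The deletion step in line 9 of Procedure 2.10 can be sketched as follows. This is a hedged illustration rather than the zap implementation: it assumes the database is kept as two lists (original and learned clauses), that literals are signed integers, and that the assignment is the set of literals set to true. The function name `forget_irrelevant` is hypothetical.

```python
def poss(clause, assignment):
    """Literals of `clause` not falsified by `assignment` (a set of true literals), minus one."""
    return sum(1 for lit in clause if -lit not in assignment) - 1


def forget_irrelevant(original, learned, assignment, k):
    """Drop learned clauses whose poss value exceeds the relevance bound k.

    Original clauses are always retained; only learned clauses may be deleted.
    Returns the pair (original, retained learned clauses).
    """
    kept = [c for c in learned if poss(c, assignment) <= k]
    return original, kept
```

Keeping the original and learned clauses in separate containers is one simple way to guarantee that the problem specification itself is never deleted.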

Theorem 2.11 Rbl is sound and complete, and uses an amount of memory polynomial in 
the size of C (although exponential in the relevance bound k). 

3 Axiom structure as a group 
3.1 Examples of structure 

While we use the implementation details embodied in Procedures 2.10 and 2.7 to imple- 
ment our ideas, the procedures themselves inherit certain weaknesses of dpll as originally 
described. Two weaknesses that we address are: 

1. The appearance of $\mathrm{poss}_0(C, P)$ in the inner unit propagation loop of the procedure
requires an examination of a significant subset of the clausal database at each inference
step, and

2. Both dpll and rbl are fundamentally resolution-based methods; there are known 
problem classes that are exponentially difficult for resolution-based methods but which 
are easy if the language in use is extended to include either cardinality or parity 
constraints. 

Let us consider each of these issues in turn. 
3.1.1 Subsearch

The set of axioms that need to be investigated in the dpll inner loop often has structure that 
can be exploited to speed the examination process. If a ground axiomatization is replaced 
with a lifted one, the search for axioms with specific syntactic properties is NP-complete in 
the number of variables in the lifted axiom, and is called subsearch for that reason. 

In many cases, search techniques can be applied to the subsearch problem. As an example, 
suppose that we are looking for unit instances of the lifted axiom 

$$a(x, y) \lor b(y, z) \lor c(x, z) \tag{1}$$

where each variable is taken from a domain of size $d$, so that (1) corresponds to $d^3$ ground
axioms. If $a(x, y)$ is true for all $x$ and $y$ (which we can surely conclude in time $O(d^2)$ or less),
then we can conclude without further work that (1) has no unit instances. If $a(x, y)$ is true
except for a single $(x, y)$ pair, then we need only examine the $d$ possible values of $z$ for unit
instances, reducing our total work from $d^3$ to $d^2 + d$.

It will be useful in what follows to make this example still more specific, so let us assume 
that x, y and z are all chosen from a two element domain {A, B}. The single lifted axiom (1) 
now corresponds to the set of ground instances: 

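$$\begin{aligned}
a(A,A) &\lor b(A,A) \lor c(A,A) \\
a(A,A) &\lor b(A,B) \lor c(A,B) \\
a(A,B) &\lor b(B,A) \lor c(A,A) \\
a(A,B) &\lor b(B,B) \lor c(A,B) \\
a(B,A) &\lor b(A,A) \lor c(B,A) \\
a(B,A) &\lor b(A,B) \lor c(B,B) \\
a(B,B) &\lor b(B,A) \lor c(B,A) \\
a(B,B) &\lor b(B,B) \lor c(B,B)
\end{aligned}$$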

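If we introduce ground literals $l_1, l_2, l_3, l_4$ for the instances of $a(x, y)$ and so on, we get:

$$\begin{aligned}
l_1 &\lor l_5 \lor l_9 \\
l_1 &\lor l_6 \lor l_{10} \\
l_2 &\lor l_7 \lor l_9 \\
l_2 &\lor l_8 \lor l_{10} \\
l_3 &\lor l_5 \lor l_{11} \\
l_3 &\lor l_6 \lor l_{12} \\
l_4 &\lor l_7 \lor l_{11} \\
l_4 &\lor l_8 \lor l_{12}
\end{aligned} \tag{2}$$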
at which point the structure implicit in (1) has apparently been obscured. We will return to 
the details of this example shortly. 
3.1.2 Cardinality

Structure is also present in the sets of axioms used to encode the pigeonhole problem, which is 
known to be exponentially difficult for any resolution-based method. The pigeonhole problem 
can be solved in polynomial time if we extend our representation to include cardinality axioms 
such as 

$$x_1 + \cdots + x_m \ge k \tag{3}$$

The single axiom (3) is equivalent to $\binom{m}{k-1}$ conventional disjunctions.

Once again, we will have use for an example presented in full detail. Suppose that we 
have the constraint 

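$$x_1 + x_2 + x_3 + x_4 + x_5 \ge 3 \tag{4}$$

saying that at least 3 of the $x_i$'s are true. This is equivalent to

$$\begin{aligned}
x_1 &\lor x_2 \lor x_3 \\
x_1 &\lor x_2 \lor x_4 \\
x_1 &\lor x_2 \lor x_5 \\
x_1 &\lor x_3 \lor x_4 \\
x_1 &\lor x_3 \lor x_5 \\
x_1 &\lor x_4 \lor x_5 \\
x_2 &\lor x_3 \lor x_4 \\
x_2 &\lor x_3 \lor x_5 \\
x_2 &\lor x_4 \lor x_5 \\
x_3 &\lor x_4 \lor x_5
\end{aligned} \tag{5}$$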
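The expansion of a cardinality constraint into disjunctions can be reproduced mechanically. The sketch below is illustrative only and assumes variables are represented by positive integers: a constraint requiring at least k of m variables to be true is equivalent to the set of all disjunctions of m − k + 1 of those variables, since any smaller number of true variables leaves some such disjunction entirely false. The function name is hypothetical.

```python
from itertools import combinations


def cardinality_to_clauses(variables, k):
    """Expand 'at least k of `variables` are true' into plain disjunctions.

    Each clause is a tuple of m - k + 1 variables, where m = len(variables).
    """
    m = len(variables)
    return list(combinations(variables, m - k + 1))


# The constraint (4): at least 3 of x1..x5 are true.
clauses = cardinality_to_clauses([1, 2, 3, 4, 5], 3)
```

For the constraint (4) this yields the ten three-literal clauses listed in (5).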



3.1.3 Parity constraints 

Finally, we consider constraints that are most naturally expressed using modular arithmetic 
or exclusive or's, such as 

$$x_1 \oplus \cdots \oplus x_k = 0$$

or

$$x_1 \oplus \cdots \oplus x_k = 1 \tag{6}$$

In either case, the parity of the sum of the $x_i$'s is specified.

It is well known that axiom sets consisting of parity constraints in isolation can be solved 
in polynomial time using Gaussian elimination, but there are examples that are exponentially 
difficult for resolution-based methods. As in the other examples we have discussed, single 
axioms such as (6) reveal structure that a straightforward Boolean axiomatization obscures. 
In this case, the single axiom (6) with k = 3 is equivalent to: 

$$\begin{aligned}
x_1 &\lor x_2 \lor x_3 \\
x_1 &\lor \neg x_2 \lor \neg x_3 \\
\neg x_1 &\lor x_2 \lor \neg x_3 \\
\neg x_1 &\lor \neg x_2 \lor x_3
\end{aligned} \tag{7}$$
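The Boolean expansion of a parity constraint can be generated in the same spirit. The following sketch is illustrative and assumes signed-integer literals: each clause rules out exactly one assignment (the one that falsifies all of its literals), so the expansion contains one clause for every assignment whose parity differs from the required one.

```python
from itertools import product


def parity_to_clauses(variables, parity):
    """Expand x_1 xor ... xor x_k = parity into 2**(k-1) disjunctions."""
    clauses = []
    for negated in product([False, True], repeat=len(variables)):
        # The clause is falsified only when exactly the negated variables are true,
        # an assignment with sum(negated) true variables; exclude it if its parity is wrong.
        if sum(negated) % 2 != parity:
            clauses.append(tuple(-v if neg else v for v, neg in zip(variables, negated)))
    return clauses


# The constraint (6) with k = 3.
clauses = parity_to_clauses([1, 2, 3], 1)
```

With k = 3 and parity 1 this produces the four clauses of (7); in general the number of clauses doubles with each additional variable.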



3.2 Formalizing structure

Of course, the ground axiomatizations (2), (5) and (7) cannot erase the structure in the 
original axioms (1), (4) and (6); they can only obscure that structure. Our goal in this 
section is to begin the process of understanding the structure in a way that lets us describe 
it in general terms. 

As a start, note that each of the axiom sets consists of axioms of equal length; it follows 
that the axioms can all be obtained from a single one simply by permuting the literals in 
the theory. In (2) and (5), literals are permuted with other literals of the same sign; in 
(7), literals are permuted with their negated versions. But in every instance, a permutation 
suffices. 

In general, the collection of permutations on a set $L$ is denoted by $\mathrm{Sym}(L)$. If the elements
of $L$ can be labeled $1, 2, \ldots, n$ in some obvious way, the collection is often denoted simply
$S_n$. If we take $L$ to be the integers from 1 to $n$, a particular permutation can be denoted by
a series of disjoint cycles, so that the permutation

$$\omega = (135)(26) \tag{8}$$

for example, would map 1 to 3, then 3 to 5, then 5 back to 1. It would also exchange 2 and
6. The order in which the disjoint cycles are written is irrelevant, as is the choice of first
element within a particular cycle.

If $\omega_1$ and $\omega_2$ are two permutations, it is obviously possible to compose them; we will write
the composition as $\omega_1 \omega_2$ where the order means that we operate on a particular element
of $\{1, \ldots, n\}$ first with $\omega_1$ and then with $\omega_2$. It is easy to see that while composition is
associative, it is not necessarily commutative.

As an example, if we compose $\omega$ from (8) with itself, we get

$$\omega^2 = (135)(26)(135)(26) = (135)(135)(26)(26) = (153)$$

where the second equality holds because disjoint cycles commute and the third holds because
$(26)^2$ is the identity cycle $()$.

The composition operator also has an inverse, since any permutation can obviously be
inverted by mapping $x$ to that $y$ with $\omega(y) = x$. In our running example, it is easy to see
that

$$\omega^{-1} = (153)(26)$$

We see, then, that the set $S_n$ is equipped with a binary operator that is associative,
and has an inverse and an identity element. This is the definition of an algebraic structure
known as a group. We draw heavily on results from group theory and some familiarity with
group-theoretic notions is helpful for understanding the present invention. One definition
and some notation we will need:

Definition 3.1 A subset $S$ of a group $G$ will be called a subgroup of $G$ if $S$ is closed under
the group operations of inversion and multiplication. We will write $S \le G$ to denote the fact
that $S$ is a subgroup of $G$, and will write $S < G$ if the inclusion is proper.

For a finite group, closure under multiplication suffices.
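The cycle notation and the left-to-right composition convention described above can be checked with a short sketch. It is illustrative only; permutations are represented as dictionaries on {1, ..., n}, and the helper names are assumptions made for this example.

```python
def from_cycles(cycles, n):
    """Build a permutation of {1..n} (as a dict) from a list of disjoint cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        for i, x in enumerate(cycle):
            perm[x] = cycle[(i + 1) % len(cycle)]
    return perm


def compose(p, q):
    """Composition pq in the document's convention: apply p first, then q."""
    return {x: q[p[x]] for x in p}


def inverse(p):
    """Map each y back to the x with p(x) = y."""
    return {y: x for x, y in p.items()}


omega = from_cycles([(1, 3, 5), (2, 6)], 6)   # the permutation (135)(26)
omega_squared = compose(omega, omega)         # equals (153)
omega_inverse = inverse(omega)                # equals (153)(26)
```

Because the two cycles are disjoint, squaring removes the transposition (26) and leaves the 3-cycle traversed in the opposite direction.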

But let us return to our examples. The set of permutations needed to generate (5) from
the first ground axiom alone is clearly just the set

$$\Omega = \mathrm{Sym}(\{x_1, x_2, x_3, x_4, x_5\}) \tag{9}$$

since these literals can be permuted arbitrarily to move from one element of (5) to another.
Note that the set $\Omega$ in (9) is a subgroup of the full permutation group $S_{2n}$ on $2n$ literals in
$n$ variables, since $\Omega$ is easily seen to be closed under inversion and composition.

What about the example (7) involving a parity constraint? Here the set of permutations
needed to generate the four axioms from the first is given by:

$$(x_1, \neg x_1)(x_2, \neg x_2) \tag{10}$$
$$(x_1, \neg x_1)(x_3, \neg x_3) \tag{11}$$
$$(x_2, \neg x_2)(x_3, \neg x_3) \tag{12}$$

Although literals are now being exchanged with their negations, this set (together with the
identity) is also closed under the group operations of inversion and composition. Since each
element is a composition of disjoint transpositions, each element is its own inverse. The
composition of the first two elements is the third.
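The closure claim for the permutations (10) through (12) can be verified directly. The sketch below is an illustration under the assumption that literals are signed integers and that a permutation is a dictionary on the six literals; the helper names `flip` and `compose` are hypothetical.

```python
from itertools import product


def flip(variables, n=3):
    """Permutation of the literals +-1..+-n that exchanges each listed variable with its negation."""
    literals = [s * v for v in range(1, n + 1) for s in (1, -1)]
    return {l: (-l if abs(l) in variables else l) for l in literals}


def compose(p, q):
    """Apply p first, then q."""
    return {l: q[p[l]] for l in p}


# The identity together with the permutations (10), (11) and (12).
elements = [flip(set()), flip({1, 2}), flip({1, 3}), flip({2, 3})]

# Closure: composing any two elements gives another element of the set.
closed = all(compose(p, q) in elements for p, q in product(elements, repeat=2))

# Images of the first clause of (7) under the four elements.
images = {frozenset(g[l] for l in (1, 2, 3)) for g in elements}
```

The four images are exactly the four clauses of (7), which is the sense in which this small group generates the parity constraint from a single ground clause.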

The remaining example (2) is a bit more subtle; perhaps this is to be expected, since the 
axiomatization (2) obscures the underlying structure far more effectively than does either 
(5) or (7). 

To understand this example, note that the set of axioms (2) is "generated" by a set of
transformations on the underlying variables. In one transformation, we swap the values of
$A$ and $B$ for $x$, corresponding to the permutation

$$(a(A,A), a(B,A))\,(a(A,B), a(B,B))\,(c(A,A), c(B,A))\,(c(A,B), c(B,B))$$

where we have included in a single permutation the induced changes to all of the relevant
ground literals. (The relation $b$ doesn't appear because $b$ does not have $x$ as an argument in
(1).) In terms of the literals in (2), this becomes

$$\omega_x = (l_1 l_3)(l_2 l_4)(l_9 l_{11})(l_{10} l_{12})$$

In a similar way, swapping the two values for $y$ corresponds to the permutation

$$\omega_y = (l_1 l_2)(l_3 l_4)(l_5 l_7)(l_6 l_8)$$

and $z$ produces

$$\omega_z = (l_5 l_6)(l_7 l_8)(l_9 l_{10})(l_{11} l_{12})$$

Now consider the subgroup of $\mathrm{Sym}(\{l_i\})$ that is generated by $\omega_x$, $\omega_y$ and $\omega_z$. We will
follow the usual convention and denote this by

$$\Omega = \langle \omega_x, \omega_y, \omega_z \rangle \tag{13}$$

Proposition 3.2 The image of any single clause in the set (2) under $\Omega$ as in (13) is exactly
the complete set of clauses (2).

As an example, operating on the first axiom in (2) with $\omega_x$ produces

$$l_3 \lor l_5 \lor l_{11}$$

This is the fifth axiom, exactly as it should be, since we have swapped $a(A, A)$ with $a(B, A)$
and $c(A, A)$ with $c(B, A)$.

Alternatively, a straightforward calculation shows that

$$\omega_x \omega_y = (l_1 l_4)(l_2 l_3)(l_5 l_7)(l_6 l_8)(l_9 l_{11})(l_{10} l_{12})$$

and maps the first axiom in (2) to the next-to-last, the second axiom to the last, and so on.
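Proposition 3.2 can be checked computationally for this example. The sketch below is illustrative only: it assumes the twelve ground literals are numbered 1 to 12 (1 to 4 for the instances of a, 5 to 8 for b, 9 to 12 for c, each in the order AA, AB, BA, BB) and computes the orbit of a clause by repeatedly applying the three generators.

```python
def from_cycles(cycles, n=12):
    """Build a permutation of {1..n} (as a dict) from disjoint cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        for i, x in enumerate(cycle):
            perm[x] = cycle[(i + 1) % len(cycle)]
    return perm


omega_x = from_cycles([(1, 3), (2, 4), (9, 11), (10, 12)])
omega_y = from_cycles([(1, 2), (3, 4), (5, 7), (6, 8)])
omega_z = from_cycles([(5, 6), (7, 8), (9, 10), (11, 12)])
generators = [omega_x, omega_y, omega_z]


def orbit(clause, generators):
    """All images of `clause` under the group generated by `generators`."""
    start = frozenset(clause)
    seen = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for g in generators:
            image = frozenset(g[l] for l in current)
            if image not in seen:
                seen.add(image)
                frontier.append(image)
    return seen


ground_clauses = orbit((1, 5, 9), generators)
```

Note that the orbit is found without enumerating the group itself; only the generators are applied, which is the property the compact representation relies on.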

It should be clear at this point what all of these examples have in common. In every case,
the set of ground instances corresponding to a single non-Boolean axiom can be generated
from any single ground instance by the elements of a subgroup of the group $S_{2n}$ of
permutations of the literals in the problem. The fact that only a tiny fraction of the subsets of $S_{2n}$
is closed under the group operations and therefore a subgroup suggests that this subgroup
property in some sense captures and generalizes the general idea of structure that underlies
our motivating examples.

Note, incidentally, that some structure is surely needed here; a problem in random
3-SAT, for example, can always be encoded using a single 3-literal clause $c$ and the
set of permutations needed to recover the entire problem from $c$ in isolation. There is no
structure because the relevant subset of $S_{2n}$ has no structure. The structure is implicit in
the requirement that the set $\Omega$ used to produce the clauses be a group; as we will see, this
structure has just the computational implications needed if we are to lift rbl and other
Boolean satisfiability techniques to this broader setting.

Let us also point out the surprising fact that the subgroup idea captures all of the
structures discussed in the introduction. It is not surprising that the various structures used
to reduce proof size all have a similar flavor, or that the structure used to speed the inner
loop be uniform. But it strikes us as remarkable that these two types of structure, used for
such different purposes, are in fact instances of a single framework.

Instead of generalizing the language of Boolean satisfiability as seems required by the
range of examples we have considered, it suffices to annotate ground clauses with the $\Omega$
needed to reproduce a larger axiom set. Before we formalize this, however, let us note that
any "reasonable" permutation that switches $l_1$ and $l_2$ should respect the semantics of the
axiomatization and switch $\neg l_1$ and $\neg l_2$ as well.
Definition 3.3 Given a set of $n$ variables, we will denote by $W_n$ that subgroup of $S_{2n}$ that
swaps $\neg l_1$ and $\neg l_2$ whenever it swaps $l_1$ and $l_2$.

Informally, an element of $W_n$ corresponds to a permutation of the $n$ variables, together
with a choice to flip some subset of them.
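The informal description suggests that $W_n$ has $2^n \, n!$ elements: $n!$ ways to permute the variables and $2^n$ choices of which to flip. The sketch below checks this by brute force for very small $n$. It is an illustration only, assuming signed-integer literals and reading the defining condition of Definition 3.3 as "the image of a negated literal is the negation of the image".

```python
from itertools import permutations


def in_W(perm):
    """True if the literal permutation `perm` (a dict) commutes with negation."""
    return all(perm[-l] == -perm[l] for l in perm)


def count_W(n):
    """Count the permutations of the 2n literals that lie in W_n (brute force)."""
    literals = [s * v for v in range(1, n + 1) for s in (1, -1)]
    return sum(in_W(dict(zip(literals, image))) for image in permutations(literals))
```

The brute-force count is only feasible for two or three variables, since it enumerates all $(2n)!$ literal permutations; it is meant as a sanity check rather than a practical test.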

Proposition 3.4 $W_n$ is the wreath product of $S_2$ and $S_n$, typically denoted $S_2 \wr S_n$.

We are now in a position to state:

Definition 3.5 An augmented clause in an $n$-variable Boolean satisfiability problem is a
pair $(c, G)$ where $c$ is a Boolean clause and $G \le W_n$. A ground clause $d$ is an instance of
an augmented clause $(c, G)$ if there is some $g \in G$ such that $d = g(c)$.

The sections below demonstrate that augmented clauses have the following properties: 

1. They can be represented compactly, 

2. They can be combined efficiently using a generalization of resolution, 

3. They generalize existing concepts such as quantification over finite domains, 
cardinality, and parity constraints, together with providing natural gener- 
alizations for proof techniques involving such constraints and for extended 
resolution, 

4. Rbl can be extended with little or no computational overhead to manipulate 
augmented clauses instead of ground ones, and 

5. Propagation can be computed efficiently in this generalized setting. 
3.3 Efficiency of representation 

For the first point, the fact that the augmentations $G$ can be represented compactly is a
consequence of $G$'s group structure. In the example surrounding the reconstruction of (5)
from (9), for example, the group in question is the full symmetric group on $m$ elements,
where $m$ is the number of variables in the cardinality constraint. In the lifting example (2),
we can describe the group in terms of the generators $\omega_x$, $\omega_y$ and $\omega_z$ instead of listing all eight
elements that the group contains. In general, we have:

Proposition 3.6 Let $S$ be a set of ground clauses, and $(c, G)$ an equivalent augmented
clause. Then a set of generators for $G = \langle \omega_1, \ldots, \omega_k \rangle$ can be found in polynomial time such
that $k \le \log_2 |G|$.

Let us make a remark regarding computational complexity. Essentially any
group-theoretic construct can be computed in time polynomial in the group size; basically one
simply enumerates the group and evaluates the construction (generate and test, as it were).
What is interesting is the collection of group constructions that can be computed in time
polynomial in the number of generators of the group. In our case, this corresponds to polylog
time in the number of instances of the augmented clauses involved.

In any event, Proposition 3.6 is the first of the results promised: In any case where $n$
Boolean axioms can be captured as instances of an augmented clause, that augmented clause
can be represented using $O(\log_2 n)$ generators.

The proof of Proposition 3.6 requires the following result from group theory:

Theorem 3.7 (Lagrange) If $G$ is a finite group and $S \le G$, then $|S|$ divides $|G|$.

Corollary 3.8 Any augmented clause in a theory containing $n$ literals can be expressed in
$O(n^2 \log_2 n)$ space.

In fact, Corollary 3.8 can be strengthened using:

Proposition 3.9 Any subgroup of $S_n$ can be described in polynomial time using at most
$O(n)$ generators.

This reduces the $O(n^2 \log_2 n)$ in the corollary to simply $O(n^2)$.

4 Inference 
4.1 Resolution 

In this section, we begin the process of discussing derivations based on augmented clauses
instead of ground ones. We begin with a few preliminaries:

Definition 4.1 Two augmented clauses $(c_1, G_1)$ and $(c_2, G_2)$ will be called equivalent if they
have identical sets of instances. This will be denoted $(c_1, G_1) \equiv (c_2, G_2)$.

Proposition 4.2 Let $(c, G)$ be an augmented clause. Then if $d$ is any instance of $(c, G)$,
$(c, G) \equiv (d, G)$.

Proposition 4.3 For ground clauses $c_1$ and $c_2$ and a permutation $\omega \in W_n$,

$$\mathrm{resolve}(\omega(c_1), \omega(c_2)) = \omega(\mathrm{resolve}(c_1, c_2))$$
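Proposition 4.3 can be illustrated on a single case. In the sketch below (an illustration, not a proof), clauses are frozensets of signed integers, the variables a, b, c, d are numbered 1 to 4, and the chosen permutation exchanges b with c and flips a; it lies in $W_4$ because it commutes with negation. The helper names are hypothetical.

```python
def resolve(c1, c2):
    """Conventional resolvent of two clauses that clash on exactly one variable."""
    pivots = [l for l in c1 if -l in c2]
    assert len(pivots) == 1, "clauses must clash on exactly one literal"
    p = pivots[0]
    return (c1 - {p}) | (c2 - {-p})


def apply(omega, clause):
    """Image of a clause under a permutation of the literals."""
    return frozenset(omega[l] for l in clause)


# omega flips variable 1 (a) and exchanges variables 2 and 3 (b and c).
omega = {1: -1, -1: 1, 2: 3, -2: -3, 3: 2, -3: -2, 4: 4, -4: -4}

c1 = frozenset({1, 2})    # a or b
c2 = frozenset({-1, 4})   # not a, or d

lhs = resolve(apply(omega, c1), apply(omega, c2))
rhs = apply(omega, resolve(c1, c2))
```

Both sides evaluate to the clause c or d. The property depends on the permutation respecting negation: a permutation outside $W_n$ could map a clashing pair of literals to a pair that no longer clashes.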

Definition 4.4 If $C$ is a set of augmented clauses, we will say that $C$ entails an augmented
clause $(c, G)$, writing $C \models (c, G)$, if every instance of $(c, G)$ is entailed by the set of instances
of the augmented clauses in $C$.

Lemma 4.5 If $G_1$ and $G_2$ are subgroups of $G$, so is $G_1 \cap G_2$.

We are now in a position to consider lifting the idea of resolution to our setting, but let 
us first discuss the overall intent of this lifting and explain an approach that doesn't work. 

What we would like to do is to think of an augmented clause as having force similar to all
of its instances; as a result, when we resolve two augmented clauses $(c_1, G_1)$ and $(c_2, G_2)$, we
would like to obtain as the (augmented) resolvent the set of all resolutions that are sanctioned
by resolving an instance of $(c_1, G_1)$ with one of $(c_2, G_2)$. At a minimum, we can certainly
conclude $(c, G_1 \cap G_2)$ where $c$ is the conventional resolvent of $c_1$ and $c_2$, since every instance
corresponds to a permutation that is sanctioned by the individual augmented resolvents. Is
this good enough?

Consider an example. Suppose that there are four variables in our problem, $a$, $b$, $c$ and
$d$ and that we are resolving the two clauses

$$(a \lor b, \langle (bc) \rangle)$$

which has instances $a \lor b$ and $a \lor c$ and

$$(\neg a \lor d, \langle \rangle)$$

which has the single instance $\neg a \lor d$. We will write these somewhat more compactly as

$$(a \lor b, (bc)) \tag{14}$$

and

$$(\neg a \lor d, 1) \tag{15}$$

respectively. If we resolve and intersect the groups, we conclude

$$(b \lor d, 1)$$

If we resolve the clauses individually, however, we see that we should be able to derive
the pair of clauses $b \lor d$ and $c \lor d$; in other words, the augmented clause

$$(b \lor d, (bc)) \tag{16}$$

It certainly seems as if it should be possible to capture this in our setting, since the clause
in (16) is just the resolvent of the clauses appearing in (14) and (15). Where does the group
generated by $(bc)$ come from?

If we want to retain the idea of intersecting the groups in the original clauses, the most
natural approach seems to be to recognize that neither $b$ nor $c$ appears in (15), so that we
can rewrite (15) in the equivalent form

$$(\neg a \lor d, (bc)) \tag{17}$$

because exchanging $b$ and $c$ in $\neg a \lor d$ has no effect at all. The group $(bc)$ in (16) is now the
intersection of the groups in (14) and (17), as desired.

Unfortunately, there are two problems with this idea. The simpler is that it seems
inappropriate to require that an augmented version of the clause $\neg a \lor d$ refer explicitly to
the remaining variables in the theory ($b$ and $c$). Surely the representation of a clause should
be independent of the problem of which that clause is a part.

More serious, however, is that the approach that we have just given simply doesn't work.
Suppose that we augment our existing theory with a fifth variable $e$. Now assuming that we
keep track of all of the unmentioned variables in the augmented clauses, (14) becomes

$$(a \lor b, (bc) \times W_{de}) \tag{18}$$

where $W_{de}$ denotes the group that exchanges $d$ and $e$ arbitrarily and may flip either (or both)
of them as well. We continue to be able to exchange $b$ and $c$, of course. In an analogous way,
(15) becomes

$$(\neg a \lor d, W_{bce}) \tag{19}$$

where we indicate that we are free to swap $b$, $c$ and $e$ in any way consistent with Definition 3.3
of the $W_i$.

It is not hard to see that

$$W_{bce} \cap ((bc) \times W_{de}) = (bc) \times W_e$$

so that the result of resolving (18) and (19) would be

$$(b \lor d, (bc) \times W_e)$$

This appears to be successful, but is not, because the resolvent needs instead to be

$$(b \lor d, (bc) \times W_{ae})$$

if we are to continue to resolve with other augmented clauses. We have lost the implicit
symmetry resulting from the fact that $a$ has been eliminated from the resolved clause.

The difficulties become clearer if we imagine resolving

$$(a \lor b, \mathrm{Sym}(\{b, c, d\}))$$

corresponding to the three clauses $a \lor b$, $a \lor c$ and $a \lor d$, with

$$(\neg a \lor e, (de))$$

corresponding to $\neg a \lor e$ and $\neg a \lor d$. The clauses that are sanctioned by the resolution should
be

$$\begin{aligned}
b &\lor d \\
b &\lor e \\
c &\lor d \\
c &\lor e \\
d &\lor e
\end{aligned}$$

where we have not included $d \lor d$ because it would correspond to a permutation where both
$b$ and $e$ are mapped to $d$.
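The list of sanctioned resolvents in this example can be reproduced by resolving the instances pairwise. The sketch below is illustrative only; it numbers the variables a through e as 1 through 5, represents clauses as frozensets of signed integers, and discards the degenerate result that would correspond to d ∨ d.

```python
def resolve(c1, c2, pivot):
    """Resolvent of c1 (containing pivot) and c2 (containing its negation)."""
    return (c1 - {pivot}) | (c2 - {-pivot})


A, B, C, D, E = 1, 2, 3, 4, 5

# Instances of (a v b, Sym({b, c, d})) and of (not-a v e, (de)).
first = [frozenset({A, B}), frozenset({A, C}), frozenset({A, D})]
second = [frozenset({-A, E}), frozenset({-A, D})]

resolvents = {resolve(c1, c2, A) for c1 in first for c2 in second}

# Drop the collapsed clause {d}, which arises from resolving a v d with not-a v d.
sanctioned = {r for r in resolvents if len(r) == 2}
```

The five remaining clauses are not the orbit of any single clause under a subgroup acting on b, c, d and e in the obvious way, which is the difficulty the following definitions address.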

The difficulty is that there is no subgroup of the permutation group that concisely
captures the above clauses. It appears that the best that we can do is the following:

Definition 4.6 Let $\omega$ be a permutation and $S$ a set. Then by $\omega|_S$ we will denote the result
of restricting the permutation to the given set.

Definition 4.7 For $K_i \subseteq L$ and $G_i \le \mathrm{Sym}(L)$, we will say that a permutation $\omega \in \mathrm{Sym}(L)$
is an extension of $\{G_i\}$ if there are $g_i \in G_i$ such that for all $i$, $\omega|_{K_i} = g_i|_{K_i}$. We will denote
the set of extensions of $\{G_i\}$ by $\mathrm{extn}(K_i, G_i)$.

The extensions need to simultaneously extend elements of all of the individual groups
acting on the various subsets.

Definition 4.8 Suppose that $(c_1, G_1)$ and $(c_2, G_2)$ are augmented clauses. Then a resolvent
of $(c_1, G_1)$ and $(c_2, G_2)$ is any augmented clause of the form $(\mathrm{resolve}(c_1, c_2), G)$ where
$G \le (\mathrm{extn}(c_i, G_i) \cap W_n)$.

Proposition 4.9 Augmented resolution is sound, in that

$$(c_1, G_1) \land (c_2, G_2) \models (c, G)$$

for any $(c, G)$ that is a resolvent of $(c_1, G_1)$ and $(c_2, G_2)$.

We also have:

Proposition 4.10 If $\mathrm{extn}(c_i, G_i) \cap W_n$ is a subgroup of $W_n$, then augmented resolution is
complete in the sense that

$$(\mathrm{resolve}(c_1, c_2), \mathrm{extn}(c_i, G_i) \cap W_n) \models \mathrm{resolve}(\omega(c_1), \omega(c_2))$$

for any permutation of literals $\omega \in W_n$ such that $\omega|_{c_1} \in G_1$ and $\omega|_{c_2} \in G_2$.

In general, however, the set of extensions need not be a subgroup of W n , and we would 
like a less ambiguous construction. To that end, we can modify Definition 4.7 as follows: 

Definition 4.11 For K_i ⊆ L and G_i ≤ Sym(L), we will say that a permutation ω ∈ Sym(L)
is a stable extension of {G_i} if there are g_i ∈ G_i such that for all i, ω|_{G_i(K_i)} = g_i|_{G_i(K_i)}. We
will denote the set of stable extensions of {G_i} by stab(K_i, G_i).

This definition is modified from Definition 4.7 only in that the restriction of ω is not just to
the original variables in K_i but to G_i(K_i), the image of K_i under the action of the group G_i.

Proposition 4.12 stab(K_i, G_i) ≤ Sym(L).

In other words, stab(K_i, G_i) is a subgroup of Sym(L).

Definition 4.13 Suppose that (c_1, G_1) and (c_2, G_2) are augmented clauses. Then the canonical
resolvent of (c_1, G_1) and (c_2, G_2), to be denoted by resolve((c_1, G_1), (c_2, G_2)), is the
augmented clause (resolve(c_1, c_2), stab(c_i, G_i) ∩ W_n).
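
To make the difference between extensions and stable extensions concrete, the following Python sketch revisits the example of (a ∨ b, Sym({b, c, d})) and (¬a ∨ e, (de)) by exhaustive enumeration. It is an illustration under stated assumptions rather than the implementation described later: it works in Sym(L) on the five variables only (sign flips in W_n are ignored), it takes K_1 = {a, b} and K_2 = {a, e}, and all function names are hypothetical.

```python
from itertools import permutations

L = "abcde"  # variables only; signs are ignored in this sketch

def perms_moving(points):
    """All permutations of L that fix every variable outside `points`."""
    result = []
    for image in permutations(points):
        p = {x: x for x in L}
        p.update(zip(points, image))
        result.append(p)
    return result

def image_of(K, G):
    """G(K): every variable reachable from K under some element of G."""
    return {g[x] for g in G for x in K}

def extensions(pairs, stable=False):
    """Definition 4.7 (stable=False) or Definition 4.11 (stable=True)."""
    found = []
    for w in perms_moving(L):
        if all(any(all(w[x] == g[x]
                       for x in (image_of(K, G) if stable else K))
                   for g in G)
               for K, G in pairs):
            found.append(w)
    return found

def is_closed(perms):
    """True if the set of permutations is closed under composition."""
    keys = {tuple(p[x] for x in L) for p in perms}
    return all(tuple(p[q[x]] for x in L) in keys for p in perms for q in perms)

pairs = [("ab", perms_moving("bcd")), ("ae", perms_moving("de"))]
extn = extensions(pairs)
stab = extensions(pairs, stable=True)
```

Under these assumptions extn has ten elements and is not closed under composition, so it is not a subgroup, while stab is the two-element group generated by (bc). The canonical resolvent in this small example therefore has only the instances b ∨ e and c ∨ e, which is sound but does not capture all five clauses listed earlier.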

Although this definition is stronger than one involving intersection, note that we might
have

(c, G) ⊨ (c', G')

but not have

resolve((c, G), (d, H)) ⊨ resolve((c', G'), (d, H))

The reason is that if c = c' but G' < G, the image of c under G may be larger than the
image under G', making the requirement of stability for images under G more stringent
than the requirement of stability for images under G'. We know of no examples where this
phenomenon occurs in practice, although one could presumably be constructed.

Proposition 4.14 resolve((c_1, G), (c_2, G)) = (resolve(c_1, c_2), G).

Proposition 4.15 resolve((c_1, G_1), (c_2, G_2)) ⊨ (resolve(c_1, c_2), G_1 ∩ G_2).

There is a variety of additional remarks to be made about Definition 4.13. First, the
resolvent of two augmented clauses can depend on the choice of the representative elements
in addition to the choice of subgroup of extn(c_i, G_i). Thus, if we resolve

(l_1, (l_1 l_2)) (20)

with

(¬l_1, 1) (21)

we get a contradiction. But if we rewrite (20) so that we are attempting to resolve (21) with

(l_2, (l_1 l_2))

no resolution is possible at all.

We should also point out that there are computational issues involved in either finding a
subgroup of extn(c_i, G_i) or evaluating the specific subgroup stab(c_i, G_i). If the component
groups G_1 and G_2 are described by listing their elements, an incremental construction is
possible where generators are gradually added until it is impossible to extend the group further
without violating Definition 4.8 or Definition 4.13. But if G_1 and G_2 are described only in
terms of their generators as suggested by Proposition 3.6, computing either stab(c_i, G_i) or
a maximal subgroup of extn(c_i, G_i) involves the following computational subtasks:

1. Given a group G and set C, find the subgroup G' ≤ G of all g ∈ G such that g(C) = C.
This set is easily seen to be a subgroup and is called the set stabilizer of C. It is often
denoted G_{C}.

2. Given a group G and set C, find the subgroup G' ≤ G of all g ∈ G such that g(c) = c
for every c ∈ C. This set is also a subgroup and is called the pointwise stabilizer of C.
It is typically denoted G_C.

3. Given two groups G_1 and G_2 described in terms of generators, find a set of generators
for G_1 ∩ G_2.

4. Given G and C, let ω ∈ G_{C}. Now ω|_C, the restriction of ω to C, makes sense because
ω(C) = C. Given a ρ that is such a restriction, find an element ρ' ∈ G such that
ρ'|_C = ρ.
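
For very small groups, the first three subtasks can be illustrated by listing every group element. The Python sketch below does exactly that and is exponential in general; it is meant only to fix the definitions, not to suggest how the computation is done from generators. Permutations are encoded as tuples of images on the points 0..n-1, and the function names are assumptions made for this illustration.

```python
def compose(p, q):
    """The permutation p∘q (apply q first), as a tuple of images."""
    return tuple(p[q[i]] for i in range(len(p)))

def generate(generators):
    """All elements of the group generated by `generators` (naive closure)."""
    identity = tuple(range(len(generators[0])))
    elements, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

def set_stabilizer(group, C):
    """Subtask 1: all g with g(C) = C as a set."""
    return {g for g in group if {g[x] for x in C} == set(C)}

def pointwise_stabilizer(group, C):
    """Subtask 2: all g with g(c) = c for every c in C."""
    return {g for g in group if all(g[x] == x for x in C)}

def intersection(group1, group2):
    """Subtask 3, on explicit element lists rather than generators."""
    return group1 & group2
```

For example, in the symmetric group on four points the set stabilizer of {0, 1} has four elements and the pointwise stabilizer has two.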

Although the average case complexity of the above operations appears to be polynomial, 
the worst case complexity is known to be polynomial only for the second and fourth. The 
worst case for the other two tasks is unknown but is generally believed not to be in P (as 
usual, in terms of the number of generators of the groups, not their absolute size). 

In the introduction, we claimed that the result of resolution was unique using reasons 
and that, "The fundamental inference step in zap is in NP with respect to the zap represen- 
tation, and therefore has worst case complexity exponential in the representation size (i.e., 
polynomial in the number of Boolean axioms being resolved). The average case complexity 
appears to be low-order polynomial in the size of the ZAP representation." The use of rea- 
sons breaks the ambiguity surrounding (20) and (21), and the remarks regarding complexity 
correspond to the computational observations just made. 

4.2 Introduction of new groups 

There is another type of inference that is possible in the zap setting. Suppose that we have 
derived 

a ∨ b

and

a ∨ c

or, in augmented form,

(a ∨ b, 1) (22)

and

(a ∨ c, 1) (23)

Now we would like to be able to replace the above axioms with the single

(a ∨ b, (bc)) (24)

Not surprisingly, this sort of replacement will underpin our eventual proof that augmented
resolution can p-simulate extended resolution.

Definition 4.16 Let S = {(c_i, G_i)} be a set of augmented clauses. We will say that an
augmented clause (c, G) follows from S by introduction if every instance of (c, G) is an
instance of one of the (c_i, G_i).

Lemma 4.17 Let S be a set of augmented clauses. Then an augmented clause (c, G) follows
from S by introduction if there is a single (c_0, G_0) ∈ S such that c is an instance of (c_0, G_0)
and G ≤ G_0.

Note that the converse does not hold. If S = {(a, (abc))} and the augmented clause (c, G)
is (a, (ab)), then (c, G) has as instances a and b, each of which is an instance of the single
augmented clause in S. But the group generated by (ab) is not a subgroup of the group
generated by (abc).
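
The counterexample can be verified directly. In the following Python sketch the variables a, b, c are encoded as the points 0, 1, 2, permutations are tuples of images, and the helper names are hypothetical; it simply lists both groups and compares instances.

```python
def compose(p, q):
    """The permutation p∘q (apply q first), as a tuple of images."""
    return tuple(p[q[i]] for i in range(len(p)))

def generate(generators):
    """All elements of the group generated by `generators` (naive closure)."""
    identity = tuple(range(len(generators[0])))
    elements, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

def instances_of_literal(point, group):
    """Instances of the unit clause consisting of the single literal `point`."""
    return {g[point] for g in group}

a, b, c = 0, 1, 2
group_abc = generate([(1, 2, 0)])   # generated by the 3-cycle (abc)
group_ab = generate([(1, 0, 2)])    # generated by the transposition (ab)
```

Here instances_of_literal(a, group_ab) is {a, b}, which is contained in instances_of_literal(a, group_abc) = {a, b, c}, even though group_ab is not a subset of group_abc.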

There is one additional technique that we will need. Suppose that we know

a ∨ x (25)

and

b ∨ y (26)

and would like to conclude

(a ∨ b ∨ x, (xy)) (27)

Can we do this via introduction?

We would like to, but we cannot. The reason is that the instances of (27) do not actually
appear in (25) or (26); a ∨ b ∨ x is not an instance of a ∨ x, but is instead a weakening of it.
(The definition below describes weakening the clausal part of an augmented clause; weakening the
group by restricting to a subgroup is covered by introduction as described in Lemma 4.17.)

Definition 4.18 An augmented clause (c', G) is a weakening of an augmented clause (c, G)
if c' is a superset of c.

It is known that proof lengths under resolution or extended resolution do not change
if weakening is included as an allowable inference. (Roughly speaking, the literals introduced
during a weakening step just have to be resolved away later anyway.) For augmented
resolution, this is not the case.

As we will see in the next section, introduction of new groups is equivalent to the introduction
of new variables in extended resolution. Unlike extended resolution, however, where
it is unclear when new variables should be introduced and what those variables should be,
the situation in zap is clearer because the new groups are used to collapse the partitioning
of a single augmented clause. One embodiment of zap itself does not include introduction
as an inference step.

5 Examples and proof complexity

Let us now turn to the examples that we have discussed previously: first-order axioms that 
are quantified over finite domains, along with the standard examples from proof complexity, 
including pigeonhole problems, clique coloring problems and parity constraints. For the first, 
we will see that our ideas generalize conventional notions of quantification while providing 
additional representational flexibility in some cases. For the other examples, we will present 
a ground axiomatization, recast it using augmented clauses, and then give a polynomially 
sized derivation of unsatisfiability using augmented resolution. Finally, we will show that 
augmented resolution can p-simulate extended resolution. 

5.1 Lifted clauses and QPROP 

To deal with lifted clauses, suppose that we have a quantified clause such as 

∀xyz. a(x, y) ∨ b(y, z) ∨ c(z) (28)

We will assume for simplicity that the variables have a common domain D, so that a grounding
of the clause (28) involves working with a map that takes a pair of elements d_1, d_2 of D
and produces the ground variable corresponding to a(d_1, d_2). In other words, if V is the set
of variables in our problem, there is an injection

a : D × D → V

In a similar way, there are injections

b : D × D → V

and

c : D → V

where the images of a, b and c are disjoint and each is an injection because distinct relation 
instances must be mapped to distinct ground atoms. 

Now given a permutation ω of the elements of D, it is not hard to see that ω induces a
permutation ω_x on V given by:

ω_x(v) = a(ω(x), y), if v = a(x, y);
         v, otherwise.

In other words, there is a mapping x from the set of permutations on D to the set of
permutations on V:

x : Sym(D) → Sym(V)

Definition 5.1 Let G and H be groups and f : G → H a function between them. f will be
called a homomorphism if it respects the group operations in that f(g_1 g_2) = f(g_1) f(g_2) and
f(g^{-1}) = f(g)^{-1}.

Proposition 5.2 x : Sym(D) → Sym(V) is an injection and a homomorphism.

In other words, x makes a "copy" of Sym(D) inside of Sym(V) corresponding to permuting
the elements of x's domain.

In a similar way, we can define homomorphisms y and z given by

ω_y(v) = a(x, ω(y)), if v = a(x, y);
         b(ω(y), z), if v = b(y, z);
         v, otherwise.

and

ω_z(v) = b(y, ω(z)), if v = b(y, z);
         c(ω(z)), if v = c(z);
         v, otherwise.

Now consider the subgroup of Sym(V) generated by ω_x(Sym(D)), ω_y(Sym(D)) and
ω_z(Sym(D)). It is clear that the three subgroups commute with one another, and that
their intersection is only the trivial permutation. This means that x, y and z collectively
inject the product D × D × D into Sym(V); we will denote this by

xyz : D^3 → Sym(V)

and it should be clear that the original quantified axiom (28) is equivalent to the augmented
axiom

(a(A, B) ∨ b(B, C) ∨ c(C), xyz(D^3))

where A, B and C are any (not necessarily distinct) elements of D. The quantification is
exactly captured by the augmentation.
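
As a sanity check on this equivalence, the following Python sketch enumerates the instances of the augmented axiom for a three-element domain and compares them with the ordinary groundings of (28). The tuple encoding of ground atoms and the function name ground_instances are assumptions made for illustration; the three permutations play the roles of ω_x, ω_y and ω_z above.

```python
from itertools import permutations, product

def ground_instances(domain, A, B, C):
    """Instances of (a(A,B) v b(B,C) v c(C), xyz(D^3)) by exhaustive enumeration.

    Each of the three permutations of the domain acts on one quantified
    argument position: wx on x, wy on y, wz on z.
    """
    perms = [dict(zip(domain, image)) for image in permutations(domain)]
    result = set()
    for wx, wy, wz in product(perms, repeat=3):
        x, y, z = wx[A], wy[B], wz[C]
        result.add((("a", x, y), ("b", y, z), ("c", z)))
    return result
```

With domain [0, 1, 2] the function returns the same 27 ground clauses whether the representative is chosen with A = B = C = 0 or with A, B, C all distinct, which matches the remark that the elements need not be distinct.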

An interesting thing is what happens to resolution in this setting: 

Proposition 5.3 Let p and q be quantified clauses such that there is a term t_p in p and
¬t_q in q where t_p and t_q have common instances. Suppose also that (p_g, P) is an augmented
clause equivalent to p and (q_g, Q) is an augmented clause equivalent to q, with p_g and q_g
resolving nontrivially. Then if no other terms in p and q have common instances, the result
of resolving p and q in the conventional lifted sense is equivalent to resolve((p_g, P), (q_g, Q)).

Note that the condition requiring lack of commonality of ground instances is necessary;
consider resolving

a(x) ∨ b

with

a(y) ∨ ¬b

In the quantified case, we get

∀xy. a(x) ∨ a(y) (29)

In the augmented case, it is not hard to see that if we resolve (a(A) ∨ b, G) with (a(A) ∨ ¬b, G)
we get

(a(A), G)

corresponding to

∀x. a(x) (30)

while if we choose to resolve (a(A) ∨ b, G) with (a(B) ∨ ¬b, G), we get instead

∀x ≠ y. a(x) ∨ a(y)

It is not clear which of these representations is superior. The conventional (29) is more
compact, but obscures the fact that the stronger (30) is entailed as well. This particular
example is simple, but other examples involving longer clauses and some residual unbound
variables can be more complex.

5.2 Proof complexity without introduction

5.2.1 Pigeonhole problems

Of the examples known to be exponentially difficult for conventional resolution-based systems,
pigeonhole problems are in many ways the simplest. As usual, we will denote by p_ij
the fact that pigeon i (of n + 1) is in hole j of n, so that there are n^2 + n variables in the
problem. We denote by G the subgroup of W_{n^2+n} that allows arbitrary exchanges of the
n + 1 pigeons or the n holes, so that G is isomorphic to S_{n+1} × S_n. This is the reason that
this particular example will be so straightforward: there is a single global group that we will
be able to use throughout the entire analysis.

Our axiomatization is now:

(¬p_11 ∨ ¬p_21, G) (31)

saying that no two pigeons can be in the same hole, and

(p_11 ∨ ··· ∨ p_1n, G) (32)

saying that the first (and thus every) pigeon has to be in some hole.
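
To see how much ground information the two augmented axioms carry, the Python sketch below expands them for small n by applying every pair of a pigeon permutation and a hole permutation. It is an illustrative enumeration only (factorial in n), and the tuple encoding of literals and the name pigeonhole_instances are assumptions.

```python
from itertools import permutations

def pigeonhole_instances(n):
    """Ground instances of (31) and (32) for n holes and n + 1 pigeons."""
    pigeons = list(range(1, n + 2))
    holes = list(range(1, n + 1))
    exclusion, coverage = set(), set()
    for pi in permutations(pigeons):
        p = dict(zip(pigeons, pi))
        for sigma in permutations(holes):
            h = dict(zip(holes, sigma))
            # Image of  -p_11 v -p_21  under (pi, sigma)
            exclusion.add(frozenset({("-p", p[1], h[1]), ("-p", p[2], h[1])}))
            # Image of  p_11 v ... v p_1n  under (pi, sigma)
            coverage.add(frozenset(("p", p[1], h[j]) for j in holes))
    return exclusion, coverage
```

For n = 3 this produces 18 exclusion clauses (one per hole and unordered pair of pigeons) and 4 coverage clauses (one per pigeon), all represented by the two augmented clauses above.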

Proposition 5.4 There is an augmented resolution proof of polynomial size of the mutual
unsatisfiability of (31) and (32).

Proposition 5.5 Any implementation of Procedure SAO that branches on positive literals
in unsatisfied clauses on line 14 will produce a proof of polynomial size of the mutual
unsatisfiability of (31) and (32), independent of specific branching choices made.

This strikes us as a remarkable result: Not only is it possible to find a proof of polynomial
length in the augmented framework, but in the presence of unit propagation, it is difficult
not to!

5.2.2 Clique coloring problems

The pigeonhole problem is difficult for resolution but easy for many other proof systems; 
clique coloring problems are difficult for both resolution and other approaches such as pseudo- 
Boolean axiomatizations. 

The clique coloring problems are derivatives of pigeonhole problems where the exact 
nature of the pigeonhole problem is obscured. Somewhat more specifically, they say that a 
graph includes a clique of n + 1 nodes (where every node in the clique is connected to every 
other), and that the graph must be colored in n colors. If the graph itself is known to be a 
clique, the problem is equivalent to the pigeonhole problem. But if we know only that the 
clique can be embedded into the graph, the problem is more difficult. 

To formalize this, we will use e_ij to describe the graph, c_ij to describe the coloring of the
graph, and q_ij to describe the embedding of the clique into the graph. The graph has m
nodes, the clique is of size n + 1, and there are n colors available. The axiomatization is:

¬e_ij ∨ ¬c_il ∨ ¬c_jl     for 1 ≤ i < j ≤ m, l = 1, ..., n            (33)
c_i1 ∨ ··· ∨ c_in         for i = 1, ..., m                           (34)
q_i1 ∨ ··· ∨ q_im         for i = 1, ..., n + 1                       (35)
¬q_ij ∨ ¬q_kj             for 1 ≤ i < k ≤ n + 1, j = 1, ..., m        (36)
e_ij ∨ ¬q_ki ∨ ¬q_lj      for 1 ≤ i < j ≤ m, 1 ≤ k ≠ l ≤ n + 1        (37)

Here e_ij means that there is an edge between graph nodes i and j, c_ij means that graph node
i is colored with the jth color, and q_ij means that the ith element of the clique is mapped to
graph node j. Thus the first axiom (33) says that two of the m nodes in the graph cannot
be the same color (of the n colors available) if they are connected by an edge. (34) says
that every graph node has a color. (35) says that every element of the clique appears in the
graph, and (36) says that no two elements of the clique map to the same node in the graph.
Finally, (37) says that the clique is indeed a clique: no two clique elements can map to
disconnected nodes in the graph. As in the pigeonhole problems, there is a global symmetry
in this problem in that any two nodes, clique elements or colors can be swapped.

Proposition 5.6 There is an augmented resolution proof of polynomial size of the mutual
unsatisfiability of (33)-(37).

5.2.3 Parity constraints

Rather than discuss a specific example here, we can show the following:

Proposition 5.7 Let C be a theory consisting entirely of parity constraints. Then determining
whether or not C is satisfiable is in P for augmented resolution.

Definition 5.8 Let S be a subset of a set of n variables. We will denote by F_S that subset
of W_n consisting of all permutations that leave the variable order unchanged and flip an even
number of variables in S.

Lemma 5.9 F_S ≤ W_n.

Lemma 5.10 Let S = {x_1, ..., x_k} be a subset of a set of n variables. Then the parity
constraint

Σ x_i ≡ 1 (38)

is equivalent to the augmented constraint

(x_1 ∨ ··· ∨ x_k, F_S)

The parity constraint

Σ x_i ≡ 0 (39)

is equivalent to the augmented constraint

(¬x_1 ∨ x_2 ∨ ··· ∨ x_k, F_S)

The construction in the proof fails in the case of modularity constraints with a base other
than 2. One of the (many) problems is that the set of permutations that flip a set of variables
of size congruent to m (mod n) is not a group unless m = 0 and n < 3. We need m = 0 for
the identity to be included, and since both

(x_1, ¬x_1) ··· (x_n, ¬x_n)

and

(x_2, ¬x_2) ··· (x_{n+1}, ¬x_{n+1})

are included, it follows that

(x_1, ¬x_1)(x_{n+1}, ¬x_{n+1})

must be included, so that n = 1 or n = 2.

5.3 Extended resolution and introduction

We conclude this section by describing the relationship between our methods and extended
resolution. As a start, we have:

Definition 5.11 An extended resolution proof for a theory T is one where T is first augmented
by a collection of sets of axioms, each set of the form

¬x_1 ∨ ··· ∨ ¬x_n ∨ w
x_1 ∨ ¬w
   ⋮                        (40)
x_n ∨ ¬w

where x_1, ..., x_n are literals in the (possibly already extended) theory T and w is a new
variable. Derivation then proceeds using conventional resolution on the augmented theory.

The introduction defines the new variable w as equivalent to the conjunction of the x_i's.
This suffices to allow the definition of an equivalent to any subexpression, since de Morgan's
laws can be used to convert disjunctions to conjunctions if need be. A conventional definition
of extended resolution allows only n = 2, since complex terms can be built up from pairs. The
equivalent definition that we have given makes the proof of Theorem 5.12 more convenient.
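
The claim that the axioms of (40) define w as the conjunction of the x_i can be confirmed with a truth table. The following Python sketch is illustrative only: it uses signed integers for literals (a common convention, assumed here), and the function names are hypothetical.

```python
from itertools import product

def extension_axioms(xs, w):
    """The clause set (40) for variables xs and a new variable w."""
    clauses = [[-x for x in xs] + [w]]      # -x_1 v ... v -x_n v w
    clauses += [[x, -w] for x in xs]        # x_i v -w, one clause per i
    return clauses

def satisfied(clauses, assignment):
    """True if every clause contains a literal made true by the assignment."""
    return all(any(assignment[abs(l)] == (l > 0) for l in clause)
               for clause in clauses)

def defines_conjunction(n):
    """Check by truth table that (40) holds exactly when w <-> (x_1 and ... and x_n)."""
    xs, w = list(range(1, n + 1)), n + 1
    clauses = extension_axioms(xs, w)
    for values in product([False, True], repeat=n + 1):
        assignment = dict(zip(xs + [w], values))
        expected = assignment[w] == all(assignment[x] for x in xs)
        if satisfied(clauses, assignment) != expected:
            return False
    return True
```

The check succeeds for any n, including the conventional case n = 2.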

Theorem 5.12

1. If resolution cannot p-simulate a proof system P, neither can augmented resolution
without introduction.

2. Augmented resolution with introduction and weakening can p-simulate extended resolution.

This theorem, together with the results of the previous subsection, supports the proof
complexity claims made for ZAP in the introduction.

6 Theoretical and procedural description 

In addition to resolution, an examination of Procedures 2.10 and 2.7 shows that we need to
be able to eliminate nogoods when they are irrelevant and to identify instances of augmented
clauses that are unit. Let us now discuss each of these issues.

The problems around irrelevance are incurred only when a new clause is added to the
database and are therefore not in the zap inner loop. Before discussing these in detail,
however, let us discuss a situation involving the quantified transitivity axiom

¬a(x, y) ∨ ¬a(y, z) ∨ a(x, z)

Now if we are trying to derive a(A, B) for an A and a B that are "far apart" given the
skeleton of the relation a that we already know, it is possible that we derive

a(A, x) ∧ a(x, B) → a(A, B)

as we search for a proof involving a single intermediate location, and then

a(A, x) ∧ a(x, y) ∧ a(y, B) → a(A, B)

as we search for a proof involving two such locations, and so on, eventually concluding

a(A, x_1) ∧ ··· ∧ a(x_n, B) → a(A, B) (41)

for some suitably large n. The problem is that if d is the size of our domain, (41) will have
d^(2n-2) ground instances and is in danger of overwhelming our unit propagation algorithm even
in the presence of reasonably sophisticated subsearch techniques. We argued previously that
this problem requires that we learn only a version of (41) for which every instance is relevant.

A way to do this is as follows:

Procedure 6.1 Given a SAT problem C and an annotated partial assignment P, to compute
learn(C, P, (c, G)), the result of adding to C an augmented clause (c, G):

1 remove from C any (d, H) for which every instance d' has poss(d', P) > k
2 {g_i} ← the generators of G
3 S ← ∅
4 while there is a g ∈ {g_i} − S such that ∀d ∈ (c, ⟨S ∪ {g}⟩) . poss(d, P) ≤ k
5     do select such a g
6        S ← S ∪ {g}
7 return C ∪ {(c, ⟨S⟩)}

The procedure gradually adds generators of the group until it is impossible to add more
without introducing an irrelevant clause (or perhaps because the entire group G has been
added). We will defer until below a discussion of a procedure for determining whether
(c, ⟨S ∪ {g}⟩) has an irrelevant instance or whether (d, H) has only irrelevant instances as
checked in line 1.
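
Procedure 6.1 can be prototyped by brute force when the groups are small enough to expand. The Python sketch below is such a prototype under explicit assumptions: literals are signed integers, group generators are dictionaries mapping variables to variables (no sign flips), the partial assignment P is given as the set of literals currently true, the relevance tests are done by enumerating every instance, and all names are hypothetical. It is not the efficient group-theoretic method that the text defers to later sections.

```python
def apply(g, clause):
    """Image of a clause under a variable permutation g (dict var -> var)."""
    return frozenset((1 if l > 0 else -1) * g.get(abs(l), abs(l)) for l in clause)

def instances(clause, generators):
    """All instances of (clause, <generators>), computed as an orbit."""
    start = frozenset(clause)
    seen, frontier = {start}, [start]
    while frontier:
        d = frontier.pop()
        for g in generators:
            e = apply(g, d)
            if e not in seen:
                seen.add(e)
                frontier.append(e)
    return seen

def poss(clause, P):
    """Number of literals not falsified by P, minus one."""
    return sum(1 for l in clause if -l not in P) - 1

def learn(C, P, c, generators, k):
    # Line 1: drop every (d, H) all of whose instances are irrelevant.
    kept = [(d, H) for d, H in C
            if any(poss(e, P) <= k for e in instances(d, H))]
    S, remaining = [], list(generators)           # lines 2-3
    added = True
    while added:                                  # lines 4-6
        added = False
        for g in remaining:
            if all(poss(d, P) <= k for d in instances(c, S + [g])):
                S.append(g)
                remaining.remove(g)
                added = True
                break
    return kept + [(frozenset(c), S)]             # line 7
```

In a small example with P = {-1} and k = 0, learning the clause {1, 2} with candidate generators that swap variables 1 and 5 or variables 2 and 3 keeps only the second generator, because the first would introduce the irrelevant instance {5, 2}.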

We will also defer discussion of a specific procedure for computing unit-propagate(P),
but do include a few theoretical comments at this point. In unit propagation, we have a
partial assignment P and need to determine which instances of axioms in C are unit. To do
this, suppose that we denote by S(P) the set of Satisfied literals in the theory, and by U(P)
the set of Unvalued literals. Now for a particular augmented clause (c, G), we are looking
for those g ∈ G such that g(c) ∩ S(P) = ∅ and |g(c) ∩ U(P)| ≤ 1. The first condition says
that g(c) has no satisfied literals; the second, that it has at most one unvalued literal.

Procedure 6.2 (Unit propagation) To compute Unit-Propagate(P) for an annotated
partial assignment P = ((l_1, c_1), ..., (l_n, c_n)):

1 while there is a (c, G) ∈ C and g ∈ G with g(c) ∩ S(P) = ∅ and |g(c) ∩ U(P)| ≤ 1
2     do if g(c) ∩ U(P) = ∅
3            then l_i ← the literal in g(c) with the highest index in P
4                 return (true, resolve((c, G), c_i))
5            else l ← the literal in g(c) unassigned by P
6                 add (l, (g(c), G)) to P
7 return (false, P)

Note that the addition made to P when adding a new literal includes both g(c), the instance
of the clause that led to the propagation, and the augmenting group as usual. We can use
(g(c), G) as the augmented clause by virtue of Proposition 4.2.
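
The test on line 1 of Procedure 6.2 can be illustrated by scanning an explicitly listed group. The Python sketch below does this for a small symmetric group; it is a naive stand-in for the search procedure that the text defers, with signed-integer literals, variable permutations given as dictionaries, P given as a list of true literals, and hypothetical function names.

```python
from itertools import permutations

def apply(g, clause):
    """Image of a clause under a variable permutation g (dict var -> var)."""
    return frozenset((1 if l > 0 else -1) * g.get(abs(l), abs(l)) for l in clause)

def sym(points):
    """Every permutation of the given variables, as dictionaries."""
    return [dict(zip(points, image)) for image in permutations(points)]

def unit_instances(clause, group, P):
    """Map each instance g(c) that has no satisfied literal and at most one
    unvalued literal to its set of unvalued literals."""
    true_lits = set(P)
    valued = {abs(l) for l in P}
    hits = {}
    for g in group:
        image = apply(g, clause)
        if image & true_lits:
            continue                      # g(c) contains a satisfied literal
        unvalued = {l for l in image if abs(l) not in valued}
        if len(unvalued) <= 1:
            hits[image] = unvalued        # empty set signals a contradiction
    return hits
```

For the clause {1, 2} with the symmetric group on variables 2, 3, 4 and P = [-1, -3], the instances {1, 2} and {1, 4} are unit and the instance {1, 3} is a contradiction.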
Finally, the augmented version of Procedure 2.10 is: 

Procedure 6.3 (Relevance-bound reasoning, rbl) Given a SAT problem C and an annotated
partial assignment P, to compute rbl(C, P):

1 if P is a solution to C
2     then return P
3 (x, y) ← unit-propagate(P)
4 if x = true
5     then (c, G) ← y
6          if c is empty
7              then return failure
8              else remove successive elements from P so that c is unit
9                   C ← learn(C, P, (c, G))
10                  return rbl(C, P)
11    else P ← y
12         if P is a solution to C
13             then return P
14             else l ← a literal not assigned a value by P
15                  return rbl(C, (P, (l, true)))



Examining the previous three procedures, we see that we need to provide implementations 
of the following: 

1. A routine that computes the group of stable extensions appearing in the definition of
augmented resolution, needed by line 4 in the unit propagation procedure 6.2.

2. A routine that finds instances of (c, G) for which g(c) ∩ S = ∅ and |g(c) ∩ U| ≤ 1 for
disjoint S and U, needed by line 1 in the unit propagation procedure 6.2.

3. A routine that determines whether (c, G) has instances for which poss(g(c), P) > k
for some fixed k, as needed by line 4 in Procedure 6.1.

4. A routine that determines whether (c, G) has an instance for which poss(g(c), P) ≤ k
for some fixed k, as needed by line 1 in Procedure 6.1.

Our focus below is on the development of efficient procedures that achieve these goals. 

7 Conclusion 

Our aim in this paper has been to give a theoretical description of a generalized representation 
scheme for satisfiability problems. The basic building block of the approach is an "augmented 
clause," a pair (c, G) consisting of a ground clause c and a group G of permutations acting 
on it; the intention is that the augmented clause is equivalent to the conjunction of the 
results of operating on c with elements of G. We argued that the structure present in the 
requirement that G be a group provides a generalization of a wide range of existing notions, 
from quantification over finite domains to parity constraints.

We went on to show that resolution could be extended to deal with augmented clauses,
and gave a generalization of relevance-based learning in this setting (Procedures 6.1-6.3).
We also showed that the resulting proof system generalized first-order techniques when applied
to finite domains of quantification, and could produce polynomially sized proofs of the
pigeonhole problem, clique coloring problems, Tseitin's graph coloring problems, and parity
constraints in general. These results are obtained without the introduction of new variables
or other choice points; a resolution in our generalized system is dependent simply on the
selection of two augmented clauses to combine. We also showed that if new groups could be
introduced, the system was at least as powerful as extended resolution.

Finally, we described the specific group-theoretic problems that need to be addressed in
implementing our ideas. As discussed in the previous section, they are:

1. Implementing the group operation associated with the generalization of resolution,

2. Finding unit instances of an augmented clause,

3. Determining whether an augmented clause has irrelevant instances, and

4. Determining whether an augmented clause has relevant instances.

We will return to these issues below.






IMPLEMENTATION 



Case 8585 



1 Introduction 

Our overall plan for describing an embodiment of zap is as follows: 

1. We begin in the next section by presenting both the algorithms to be used and repeating 
the underlying group-theoretic constructs where necessary. 

2. We next describe the group-theoretic computations required by the zap implementa- 
tion. The other elements of the procedures in Section 2 all have analogs in existing 
implementations, and we do not describe them further here. 

3. Sections 4 and 5 describe the implementations of the computations discussed in Sec- 
tion 2. (The intervening section 3 gives a brief introduction to some of the ideas in 
computational group theory that we use.) For each basic construction, we describe the 
algorithm used and give an example of the computation in action. 

After describing the implementation, we describe the interface to one embodiment in Sec- 
tion 6. 

2 ZAP fundamentals and basic structure 

Let us begin not with ZAP, but with a description of the architecture of modern Boolean 
satisfiability engines. We start with the unit propagation procedure, which we describe as 
follows: 

Definition 2.1 Given a Boolean satisfiability problem described in terms of a set C of
clauses, a partial assignment is an assignment of values (true or false) to some subset of
the variables appearing in C. We represent a partial assignment P as a sequence of literals
P = (l_i) where the appearance of v_i in the sequence means that v_i has been set to true, and
the appearance of ¬v_i means that v_i has been set to false.

An annotated partial assignment is a sequence P = ((l_i, c_i)) where c_i is the reason for
the associated choice l_i. If c_i = true, it means that the variable was set as the result of a
branching decision; otherwise, c_i is a clause that follows from C and that entails l_i
by virtue of the choices of the previous l_j for j < i. (See above for additional details.)

Given a (possibly annotated) partial assignment P, we denote by S(P) the literals {l_i}
that are satisfied by P, and by U(P) the set of literals that are unvalued by P.

Procedure 2.2 (Unit propagation) To compute Unit-Propagate(P) for an annotated
partial assignment P = ((l_1, c_1), ..., (l_n, c_n)):

1 while there is a c ∈ C with c ∩ S(P) = ∅ and |c ∩ U(P)| ≤ 1
2     do if c ∩ U(P) = ∅
3            then l_i ← the literal in c with the highest index in P
4                 return (true, resolve(c, c_i))
5            else l ← the literal in c unassigned by P
6                 P ← P ∪ (l, c)
7 return (false, P)

The result returned depends on whether or not a contradiction was encountered during
the propagation, with the first result returned being true if a contradiction was found and
false if none was found. In the former case (where the clause c has no unvalued literals in
line 2), then if c is the clause and c_i is the reason that the last variable was set in a way that
caused c to be unsatisfiable, we resolve c with c_i and return the result as a new nogood for
the problem in question. Otherwise, we eventually return the partial assignment, augmented
to include the variables that have been set during the propagation process.
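
A direct transcription of Procedure 2.2 into Python might look as follows. This is a sketch under assumptions, not the engine's actual code: clauses are frozensets of signed integers, P is a list of (literal, reason) pairs with reason True for a branching decision, and when the most recently set literal was a branching decision (or the clause is empty) the conflicting clause itself is returned, a case the procedure as stated leaves open.

```python
def resolve(c1, c2):
    """Resolvent of two clauses that clash on a single variable."""
    clash = {l for l in c1 if -l in c2}
    return frozenset(l for l in c1 | c2 if l not in clash and -l not in clash)

def unit_propagate(clauses, P):
    P = list(P)
    while True:
        true_lits = {l for l, _ in P}
        valued = {abs(l) for l in true_lits}
        found = None
        for c in clauses:                                   # line 1
            if c & true_lits:
                continue                                    # c is satisfied
            unvalued = [l for l in c if abs(l) not in valued]
            if len(unvalued) <= 1:
                found = (c, unvalued)
                break
        if found is None:
            return (False, P)                               # line 7
        c, unvalued = found
        if not unvalued:                                    # line 2
            falsifiers = [i for i, (l, _) in enumerate(P) if -l in c]
            if not falsifiers:
                return (True, c)                            # empty clause
            reason = P[max(falsifiers)][1]                  # line 3
            # Assumption: with no reason clause available, return c itself.
            return (True, c if reason is True else resolve(c, reason))  # line 4
        P.append((unvalued[0], c))                          # lines 5-6
```

With clauses {-1, 2}, {-2, 3} and {-2, -3} and the single decision 1, propagation sets 2 and 3, finds the third clause falsified, and returns the nogood {-2}.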

Given unit propagation, the overall inference procedure is the following:

Procedure 2.3 (Relevance-bound reasoning, rbl) Given a SAT problem C and an annotated
partial assignment P, to compute rbl(C, P):

1 if P is a solution to C
2     then return P
3 (x, y) ← unit-propagate(P)
4 if x = true
5     then c ← y
6          if c is empty
7              then return failure
8              else remove successive elements from P so that c is unit
9                   C ← learn(C, P, c)
10                  return rbl(C, P)
11    else P ← y
12         if P is a solution to C
13             then return P
14             else l ← a literal not assigned a value by P
15                  return rbl(C, (P, (l, true)))

The procedure is recursive. If at any point we have solved the overall problem, we're done.
Otherwise, we propagate with unit propagation; if a contradiction is found and a clause c is
returned, we use the (currently unspecified) learn procedure to incorporate c into the solver's
current state, and then recurse. (If c is empty, it means that we have derived a contradiction
and the procedure fails.) In the backtracking step (line 8), we backtrack not just until c is
satisfiable, but until it enables a unit propagation. This leads to increased flexibility in the
choice of variables to be assigned after the backtrack is complete, and generally improves
performance.

If unit propagation does not indicate the presence of a contradiction, we pick an unvalued
literal, set it to true, and recurse again. Note that we don't need to try both true and false
for the literal l; if we eventually need to backtrack and set l to false, that will be handled by the
modification to P in line 8.

Finally, we need to present the procedure used to incorporate a new nogood into the 
clausal database C. In order to do that, we need to make the following definition: 

Definition 2.4 Let ∨_i l_i be a clause, which we will denote by c, and let P be a partial
assignment. We will say that the possible value of c under P is given by

poss(c, P) = |{i | ¬l_i ∉ P}| − 1

If no ambiguity is possible, we will write simply poss(c) instead of poss(c, P). In other
words, poss(c) is the number of literals that are either already satisfied or not valued by P,
reduced by one (since the clause requires that one literal be true).

Note that poss(c, P) = |c ∩ [U(P) ∪ S(P)]| − 1, since each expression is one less than the
number of potentially satisfiable literals in c.

The possible value of a clause is essentially a measure of what other authors have called its
irrelevance. A clause c with poss(c, P) = 0 can be used for unit propagation; if poss(c, P) =
1, it means that a change to a single variable can lead to a unit propagation, and so on. The
notion of learning used in relevance-bounded inference is now captured by:

Procedure 2.5 Given a SAT problem C and an annotated partial assignment P, to compute
learn(C, P, c), the result of adding to C a clause c and then removing irrelevant clauses:

1 remove from C any d ∈ C with poss(d, P) > k
2 return C ∪ {c}
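
Definition 2.4 and Procedure 2.5 are short enough to state directly in code. In the following Python sketch, clauses are frozensets of signed integers, P is the set of literals currently true, and the relevance bound k is passed explicitly; these representation choices are assumptions for illustration.

```python
def poss(clause, P):
    """Definition 2.4: the number of literals whose negation is not in P, minus one."""
    return sum(1 for l in clause if -l not in P) - 1

def learn(C, P, c, k):
    """Procedure 2.5: discard clauses with poss > k, then add the new clause c."""
    return [d for d in C if poss(d, P) <= k] + [c]
```

For instance, with P = {-1} the clause {1, 2} has possible value 0 and can be used for unit propagation, while {3, 4, 5} has possible value 2 and is discarded for k = 1.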

10 In zap, we continue to work with these procedures in approximately their current form, 

but replace the idea of a clause (a disjunction of literals) with that of an augmented clause: 

Definition 2.6 An augmented clause in an n-variable Boolean satisfiability problem is a 
pair (c, G) where c is a Boolean clause and G < W n . A (nonaugmented) clause d is an 
instance of an augmented clause (c, G) if there is some g EG such that d = g(c). 

Roughly speaking, an augmented clause consists of a conventional clause and a group G
of permutations acting on it; the intent is that we can act on the clause with any element
of the group and still get a clause that is "part" of the original theory. The group G is
required to be a subgroup of W_n = S_2 ≀ S_n, which means that each permutation g ∈ G
can permute the variables in the problem and flip the signs of an arbitrary subset as well.

We showed previously that suitably chosen groups correspond to cardinality constraints,
parity constraints (the group flips the signs of any even number of variables), and universal
quantification over finite domains.
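To make Definition 2.6 concrete, the following Python sketch enumerates the instances of an augmented clause as the orbit of the clause under the generators of G. The encoding is an assumption for illustration: literals are nonzero integers, and a generator is a dictionary from a variable to the signed literal it is sent to (unlisted variables are fixed), so that negation is preserved as required of elements of W_n.

```python
def apply(g, lit):
    """Apply g (dict: variable -> signed literal) to a literal, preserving negation."""
    image = g.get(abs(lit), abs(lit))
    return image if lit > 0 else -image


def instances(clause, generators):
    """All ground instances g(c) for g in the group generated by `generators`.

    Computed as the orbit of the clause (a frozenset of literals) under the
    generators, so the group itself is never enumerated element by element.
    """
    start = frozenset(clause)
    seen, frontier = {start}, [start]
    while frontier:
        c = frontier.pop()
        for g in generators:
            image = frozenset(apply(g, lit) for lit in c)
            if image not in seen:
                seen.add(image)
                frontier.append(image)
    return seen


# Parity constraint: sign flips of an even number of variables, applied to x1 v x2 v x3.
flips = [{1: -1, 2: -2}, {2: -2, 3: -3}]
for inst in sorted(instances([1, 2, 3], flips), key=sorted):
    print(sorted(inst))
```

The two sign-flip generators stand for a group of four elements, and the four resulting clauses together rule out exactly the assignments with an even number of true variables.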

We must now lift the previous three procedures to an augmented setting. The first two
are straightforward. In unit propagation, for example, instead of checking to see if any clause
c ∈ C is unit given the assignments in P, we must now check to see if any augmented clause
(c, G) has a unit instance. Other than that, the procedure is essentially unchanged from
Procedure 2.2:

Procedure 2.7 (Unit propagation) To compute Unit-Propagate(P) for an annotated
partial assignment P = ((l_1, c_1), . . . , (l_n, c_n)):

1 while there is a (c, G) ∈ C and g ∈ G with g(c) ∩ S(P) = ∅ and |g(c) ∩ U(P)| ≤ 1
2     do if g(c) ∩ U(P) = ∅
3         then l_i ← the literal in g(c) with the highest index in P
4             return (true, resolve((g(c), G), c_i))
5         else l ← the literal in g(c) unassigned by P
6             add (l, (g(c), G)) to P
7 return (false, P)

The basic inference procedure itself is also virtually unchanged:

Procedure 2.8 (Relevance-bounded reasoning, rbl) Given a SAT problem C and an
annotated partial assignment P, to compute rbl(C, P):

1 if P is a solution to C
2     then return P
3 (x, y) ← Unit-Propagate(P)
4 if x = true
5     then (c, G) ← y
6         if c is empty
7             then return failure
8             else remove successive elements from P so that c is unit
9                 C ← learn(C, P, (c, G))
10                return rbl(C, P)
11    else P ← y
12        if P is a solution to C
13            then return P
14            else l ← a literal not assigned a value by P
15                return rbl(C, (P, (l, true)))



In line 5, although unit propagation returns an augmented clause (c, G), the instance c is still 
the reason for the backtrack; it follows that line 8 is unchanged from the Boolean version. 
The tricky part is the learning procedure 2.5, which becomes: 

Procedure 2.9 Given a SAT problem C and an annotated partial assignment P, to compute 
learn(C, P, (c, G)), the result of adding to C an augmented clause (c, G): 

1 remove from C any (d, H) for which every instance d' has poss(d', P) > k
2 {g_i} ← the generators of G
3 S ← ∅
4 while there is a g ∈ {g_i} − S such that ∀d ∈ (c, ⟨S ∪ {g}⟩) . poss(d, P) ≤ k
5     do select such a g
6         S ← S ∪ {g}
7 return C ∪ {(c, ⟨S⟩)}

There are two differences from the Boolean case. First, in line 1, we remove not just 
clauses that are irrelevant, but augmented clauses for which every instance is irrelevant. 
Presumably, it will be useful to retain the clause as long as it has some relevant instance. 

More interesting is the difference in lines 2-6. As we pointed out above, we generally do
not want to learn the most general possible resolvent of two existing clauses, since such a
resolvent will have many irrelevant instances that slow the search for instances with specific
properties in line 1 of the unit propagation procedure 2.7.

Previously, we showed that a proof engine built around the above three procedures would 
have the following properties: 

• Since the number of generators of a group is logarithmic in the group size, it would 
achieve exponential improvements in basic representational efficiency. 

• Since only k-relevant nogoods are retained as the search proceeds, the memory
requirements are polynomial in the size of the problem being solved.

• It can produce polynomially sized proofs of the pigeonhole and clique coloring problems, 
and any parity problem. 

• If we allow the procedures to be augmented in a way that allows the identification and 
introduction of new groups, it can p-simulate extended resolution. 

• It generalizes first-order inference provided that all quantifiers are universal and all 
domains of quantification are finite. 

We stated without proof that the unit propagation procedure 2.7 can be implemented in a 
way that generalizes both subsearch and the watched literal idea. 

2.1 Group-theoretic elements 

Examining the above three procedures, the elements that are new relative to Boolean engines 
are the following: 

1. In line 1 of the unit propagation procedure 2.7, we need to find unit instances of an 
augmented clause (c, G). 

2. In line 4 of the same procedure 2.7, we need to compute the resolvent of two augmented 
clauses. 

3. In line 1 of the learning procedure 2.9, we need to determine if an augmented clause 
has any relevant instances. 

4. In line 4 of the learning procedure, we need to determine if an augmented clause has 
any irrelevant instances. 

The first, third and fourth of these needs are different from the second. For resolution, 
we need the following definition: 

Definition 2.10 For a permutation p and set S with p(S) = S, by p|_S we will mean the
restriction of p to the given set, and we will say that p is a lifting of p|_S back to the original
set on which p acts.

Definition 2.11 For K_i ⊆ L and G_i ≤ Sym(L), we will say that a permutation ω ∈ Sym(L)
is a stable extension of {G_i} if there are g_i ∈ G_i such that for all i,
ω|_(G_i(K_i)) = g_i|_(G_i(K_i)). We will denote the set of stable extensions of {G_i} by
stab(K_i, G_i).

The set of stable extensions stab(K_i, G_i) is closed under composition, and is therefore a
subgroup of Sym(L).

Definition 2.12 Suppose that (c_1, G_1) and (c_2, G_2) are augmented clauses. Then the
(canonical) resolvent of (c_1, G_1) and (c_2, G_2), to be denoted by resolve((c_1, G_1), (c_2, G_2)),
is the augmented clause (resolve(c_1, c_2), stab(c_i, G_i) ∩ W_n).

It follows from the above definitions that computing the resolvent of two augmented 
clauses as required by Procedure 2.7 is essentially a matter of computing the set of stable 
extensions of the groups in question. We will return to this problem in Section 4. 

The remaining problems can all be viewed as instances of the following: 

Definition 2.13 Let c be a clause, viewed as a set of literals, and G a group of permutations
acting on c. Now fix sets of literals S and U, and an integer k. We will say that the
k-transporter problem is that of finding a g ∈ G such that g(c) ∩ S = ∅ and |g(c) ∩ U| ≤ k,
or reporting that no such g exists.

All of the remaining problems we have discussed are instances of the k-transporter problem.
To find a unit instance of (c, G), we set S to be the set of satisfied literals and U the
set of unvalued literals. Taking k = 1 implies that we are searching for an instance with no
satisfied and at most one unvalued literal.

To find a relevant instance, we set S = ∅ and U to be the set of all satisfied or unvalued
literals. Taking k to be the relevance bound corresponds to a search for a relevant instance.

To find an irrelevant instance, we continue to set S = ∅, and take U to be the set of all
unsatisfied literals. Now if the relevance bound is r, we set k = |c| − r − 1. After all, if the
clause contains at most |c| − r − 1 unsatisfied literals, it will contain at least r + 1 satisfied
or unvalued literals, so that poss(g(c)) > r and the instance will be irrelevant.
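The k-transporter problem of Definition 2.13 can be stated very directly in code. The Python sketch below is a brute-force illustration only: it enumerates the whole group by closure, which is exponential in the number of generators and is exactly what the techniques discussed in this description are meant to avoid. Its encoding is an assumption: points are the integers 0..m-1, a permutation is a tuple p with p[x] the image of x, and only positive literals are modeled.

```python
def compose(p, q):
    """Left-to-right product: apply p first, then q."""
    return tuple(q[i] for i in p)


def generate(generators):
    """Enumerate the whole group by closure; exponential, for illustration only."""
    identity = tuple(range(len(generators[0])))
    seen, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen


def k_transporter(clause, generators, S, U, k):
    """Return some g with g(c) disjoint from S and |g(c) & U| <= k, or None."""
    for g in generate(generators):
        image = {g[x] for x in clause}
        if not (image & S) and len(image & U) <= k:
            return g
    return None


# Points 0..3 stand for a, b, c, d; the clause is a v b and the group is Sym(a, b, c, d).
s4 = [(1, 0, 2, 3), (1, 2, 3, 0)]
# Unit-instance query: S = satisfied literals {a, b}, U = unvalued literals (none), k = 1.
g = k_transporter({0, 1}, s4, S={0, 1}, U=set(), k=1)
print(sorted(g[x] for x in (0, 1)))
```

With a and b satisfied, the only instance that avoids S is the one whose literals are c and d, so any returned element maps {a, b} onto {c, d}. The relevant-instance and irrelevant-instance queries differ only in the choice of S, U and k.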

The remainder of the theoretical material in this paper is therefore focused on these two
problems: computing the stable extensions of a pair of groups, and solving the k-transporter
problem. Before we discuss the techniques used to solve these two problems, we present a
brief overview of computational group theory generally.

3 Computational group theory 

Background information on group theory can be found in Rotman's An Introduction to the 
Theory of Groups and Seress' Permutation Group Algorithms, both of which are hereby 
incorporated by reference herein. 

Our goal here is to provide enough general understanding of computational group theory
that it will be possible to work through some examples in what follows. With that in mind,
there are three basic ideas that we convey:

1. Stabilizer chains. These underlie the fundamental technique whereby large groups are
represented efficiently. They also underlie many of the subsequent computations done
using those groups.

2. Coset decompositions. Given a group G and a subgroup H ≤ G, the cosets of H
partition G. Each of the cosets can itself be partitioned using a subgroup of H, and so
on; this gradual refinement underpins many of the search-based group algorithms that
have been developed.

3. Lex-leader search. In general, it is possible to establish a lexicographic ordering on the
elements of a permutation group; if we are searching for an element of the group having
a particular property (as in the k-transporter problem), we can assume without loss
of generality that we are looking for an element that is minimal under this ordering.
This often allows the search to be pruned, since any portion of the search that can be
shown not to contain such a minimal element can be eliminated.

3.1 Stabilizer chains 

While the fact that a group G can be described in terms of an exponentially smaller number
of generators {g_i} is attractive from a representational point of view, there are many issues
that arise if a large set of clauses is represented in this way. Perhaps the most fundamental
is that of simple membership: How can we tell if a fixed clause d is an instance of the
augmented clause (c, G)?

In general, this is an instance of the 0-transporter problem; we need some g ∈ G for
which g(c), the image of c under g, does not intersect the complement of d. A simpler but
clearly related problem assumes that we have a fixed permutation g such that g(c) = d; is
g ∈ G or not? Given a representation of G in terms simply of its generators, it is not obvious
how this can be determined quickly.

Of course, if G is represented via a list of all of its elements, we could sort the elements 
lexicographically and use a binary search to determine if g were included. Virtually any 
problem of interest to us can be solved in time polynomial in the size of the groups involved, 
but we would like to do better, solving the problems in time polynomial in the number 
of generators, and therefore polynomial in the logarithm of the size of the groups (and so 
polylog in the size of the original clausal database). We will call a procedure polynomial 
only if it is indeed polytime in the number of generators of G and in the size of the set of 
literals on which G acts. It is only for such polynomial procedures that we can be assured 
that zap's representational efficiencies will mature into computational gains. 

For the membership problem, that of determining if g ∈ G given a representation of G
in terms of its generators, we need to have a coherent way of understanding the structure of
the group G itself. If we suppose that G is a subgroup of the group Sym(L) of symmetries
of some set L, we can enumerate the elements of L as L = {l_1, . . . , l_n}.

There will now be some subset G^[2] ⊆ G that fixes l_1 in that for any h ∈ G^[2] we have
h(l_1) = l_1. From this point forward, we will generally use notation that has become popular
in the computational group theory community, and write l_1^h for the image of l_1 under h, so
that the condition in question requires that l_1^h = l_1. The reason for the representational
shift is that it corresponds naturally to the fact that the composition of two group elements
fg acts with f first and then with g, since g(f(x)) is now written x^(fg). The fact that
g(f(x)) = (fg)(x) (note the awkward variable order in this representation) now becomes the
natural x^(fg) = (x^f)^g (note the normal variable order).

It is easy to see that G^[2] is closed under composition, since if any two elements fix l_1,
then so does their composition. It follows that G^[2] is actually a subgroup of G. In fact, we
have:

Definition 3.1 Given a group G and set L, the pointwise stabilizer of L is the subgroup
G' ≤ G of all g ∈ G such that l^g = l for every l ∈ L, and will be denoted G_L. The set
stabilizer of L is that subgroup G' ≤ G of all g ∈ G such that L^g = L, and will be denoted
G_{L}.

Having defined G^[2] as the point stabilizer of l_1, we can go on to define G^[3] as the point
stabilizer of l_2 within G^[2], so that G^[3] is in fact the pointwise stabilizer of {l_1, l_2} in G.
Similarly, we define G^[i+1] to be the pointwise stabilizer of l_i in G^[i] and thereby construct a
chain of stabilizers

G = G^[1] ≥ G^[2] ≥ · · · ≥ G^[n] = 1

where the last group is necessarily trivial because once n - 1 points of L are stabilized, the 
last point must be also. 

If we want to describe G in terms of its generators, we will now assume that we describe
all of the G^[i] in terms of generators, and furthermore, that the generators for G^[i] are a
superset of the generators for G^[i+1]. We can do this because G^[i+1] is a subgroup of G^[i].

Definition 3.2 A strong generating set S for a group G ≤ Sym(l_1, . . . , l_n) is a set of
generators for G with the property that

⟨S ∩ G^[i]⟩ = G^[i]

for i = 1, . . . , n.

As usual, ⟨g_i⟩ denotes the group generated by the g_i.

It is easy to see that a generating set is strong just in case it has the property discussed
above, in that each G^[i] can be generated incrementally from G^[i+1] and the generators that
are in fact elements of G^[i] − G^[i+1].

As an example, suppose that G = S_4, the symmetric group on 4 elements (which we will
denote 1, 2, 3, 4). Now it is not hard to see that S_4 is generated by the 4-cycle (1, 2, 3, 4)
and the transposition (3, 4), but this is not a strong generating set. G^[2] is the subgroup of
S_4 that stabilizes 1 (and is therefore isomorphic to S_3, since it can arbitrarily permute the
remaining three points) but

⟨S ∩ G^[2]⟩ = ⟨(3, 4)⟩ = G^[3] ≠ G^[2] (1)

If we want a strong generating set, we need to add (2, 3, 4) or a similar permutation to the
generating set, so that (1) becomes

⟨S ∩ G^[2]⟩ = ⟨(2, 3, 4), (3, 4)⟩ = G^[2]

Here is a slightly more interesting example. Given a permutation, it is always possible to 
write that permutation as a composition of transpositions. One possible construction maps 
1 where it is supposed to go, then ignores it for the rest of the construction, and so on. Thus 
we have for example 

(1,2,3,4) = (1,2)(1,3)(1, 4) (2) 

where the order of composition is from left to right, so that 1 maps to 2 by virtue of the first 
transposition and is then left unaffected by the other two, and so on. 
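Equation (2) and the left-to-right composition convention can be checked mechanically. The Python sketch below uses helper names (cycle, compose, parity) that are introduced here for illustration only; permutations of {1, ..., n} are stored 0-indexed as tuples.

```python
def cycle(n, *points):
    """Permutation of {1..n} (stored 0-indexed as a tuple) given by one cycle."""
    perm = list(range(n))
    for a, b in zip(points, points[1:] + points[:1]):
        perm[a - 1] = b - 1
    return tuple(perm)


def compose(*perms):
    """Left-to-right product: the first permutation listed acts first."""
    result = tuple(range(len(perms[0])))
    for p in perms:
        result = tuple(p[i] for i in result)
    return result


def parity(perm):
    """0 for an even permutation, 1 for an odd one (n minus the number of cycles, mod 2)."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = perm[x]
    return (len(perm) - cycles) % 2


product = compose(cycle(4, 1, 2), cycle(4, 1, 3), cycle(4, 1, 4))
print(product == cycle(4, 1, 2, 3, 4))   # equation (2), composed left to right
print(parity(product))                   # three transpositions: odd
```

The same parity helper distinguishes the even permutations, which is the property used in the discussion that follows.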

While the representation of a permutation in terms of transpositions is not unique, the
parity of the number of transpositions is; a permutation can always be represented as a
product of an even or an odd number of transpositions, but not both. Furthermore, the
product of two products of transpositions of lengths l_1 and l_2 can obviously be represented
as a product of length l_1 + l_2, and it follows that the product of two "even" permutations is
itself even, and we have:

Definition 3.3 The alternating group of order n, to be denoted by A_n, is the subgroup of
even permutations of S_n.

What about a strong generating set for A_n? If we fix the first n − 2 points, then the
transposition (n − 1, n) is obviously odd, so we must have A_n^[n−1] = 1, the trivial group. For
any smaller i, we can get a subset of A_n by taking the generators for S_n^[i] and operating on
each as necessary with the transposition (n − 1, n) to make it even. An n-cycle is odd if and
only if n is even (consider (2)), so given the strong generating set

{(n−1, n), (n−2, n−1, n), . . . , (2, 3, . . . , n), (1, 2, . . . , n)}

for S_n, a strong generating set for A_n if n is odd is

{(n−2, n−1, n), (n−3, n−2, n−1, n)(n−1, n), . . . , (2, 3, . . . , n)(n−1, n), (1, 2, . . . , n)}

and if n is even is

{(n−2, n−1, n), (n−3, n−2, n−1, n)(n−1, n), . . . , (2, 3, . . . , n), (1, 2, . . . , n)(n−1, n)}

We can simplify these expressions slightly to get

{(n−2, n−1, n), (n−3, n−2, n−1), . . . , (2, 3, . . . , n−1), (1, 2, . . . , n)}

if n is odd and

{(n−2, n−1, n), (n−3, n−2, n−1), . . . , (2, 3, . . . , n), (1, 2, . . . , n−1)}

if n is even.

Given a strong generating set, it is possible to compute the size of the original group G. 
To do this, we need the following well known definition and result: 






Definition 3.4 Given groups H ≤ G and g ∈ G, we define Hg to be the set of all hg for
h ∈ H. For any such g, we will say that Hg is a (right) coset of H in G.

Proposition 3.5 Let Hg_1 and Hg_2 be two cosets of H in G. Then |Hg_1| = |Hg_2| and the
cosets are either identical or disjoint.

In other words, given a subgroup H of a group G, the cosets of H partition G. This leads
to:

Definition 3.6 For groups H ≤ G, the index of H in G, denoted [G : H], is the number of
distinct cosets of H in G.

Corollary 3.7 For a finite group G, [G : H] = |G|/|H|.

Given that the cosets partition the original group G, it is natural to think of them as
defining an equivalence relation on G, where x ≈ y if and only if x and y belong to the same
coset of H. We have:

Proposition 3.8 x ≈ y if and only if xy^(-1) ∈ H.

Many equivalence relations on groups are of this form. So many, in fact, that given an
equivalence relation on the elements of a group, it is natural to look for a subgroup H such
that the cosets of H define the equivalence relation.
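The coset facts above (Proposition 3.5, Corollary 3.7 and Proposition 3.8) can be verified by brute force on a small case. The Python sketch below does so for G = S_3 and H generated by the transposition (1, 2); the choice of group, the 0-indexed tuple encoding and the helper names are assumptions made for the example.

```python
from itertools import permutations


def compose(p, q):
    """Left-to-right product: apply p first, then q."""
    return tuple(q[i] for i in p)


def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)


G = set(permutations(range(3)))      # S_3, points stored 0-indexed
H = {(0, 1, 2), (1, 0, 2)}           # subgroup generated by the transposition (1, 2)


def right_coset(g):
    """The coset Hg = {hg : h in H}."""
    return frozenset(compose(h, g) for h in H)


def equivalent(x, y):
    """Proposition 3.8: x and y share a coset exactly when x y^-1 lies in H."""
    return compose(x, inverse(y)) in H


cosets = {right_coset(g) for g in G}
print(len(cosets), len(G) // len(H))                       # the index [G : H], two ways
print(all(len(c) == len(H) for c in cosets))               # equal-sized cosets
print(set().union(*cosets) == G)                           # the cosets cover G
print(all((right_coset(x) == right_coset(y)) == equivalent(x, y)
          for x in G for y in G))                          # Proposition 3.8
```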

Returning to stabilizer chains, let us denote by l_i^(G^[i]) the orbit of l_i under G^[i] (i.e., the set
of all points to which G^[i] maps l_i). We now have:

Proposition 3.9 Given a group G acting on a set {l_i} and stabilizer chain G^[i],

|G| = ∏_i |l_i^(G^[i])| (3)



Note that the expression in (3) is easy to compute given a strong generating set. As an
example, given the strong generating set {(1, 2, 3, 4), (2, 3, 4), (3, 4)} for S_4, it is clear that
S_4^[3] = ⟨(3, 4)⟩ and the orbit of 3 is of size 2. The orbit of 2 in S_4^[2] = ⟨(2, 3, 4), (3, 4)⟩ is of
size 3, and the orbit of 1 in S_4^[1] is of size 4. So the total size of the group is 4! = 24.

For A_4, the strong generating set is {(1, 2, 3, 4)(3, 4), (2, 3, 4)} = {(1, 2, 3), (2, 3, 4)}. The
orbit of 2 in A_4^[2] = ⟨(2, 3, 4)⟩ is clearly of size 3, and the orbit of 1 in A_4^[1] = A_4 is of size 4.
So |A_4| = 12. In general, there are exactly two cosets of the alternating group in S_n because
all of the odd permutations can be constructed by multiplying the even permutations in A_n
by a fixed transposition t. Thus |A_n| = n!/2.

We can evaluate the size of A_n using strong generators by realizing that the orbit of 1 is
of size n, that of 2 is of size n − 1, and so on, until the orbit of n − 2 is of size 3. The orbit of
n − 1 is of size 1, however, since the transposition (n − 1, n) is not in A_n. Thus |A_n| = n!/2
as before.
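The orbit-size product of Proposition 3.9 can be sketched in a few lines of Python. The sketch assumes that the points are taken in the order 1, ..., n (stored 0-indexed), and that the generators of the i-th stabilizer are simply those members of the supplied set that fix all earlier points, which is valid only when the set is strong. The helper names are hypothetical.

```python
def cycle(n, *points):
    """Permutation of {1..n} (stored 0-indexed as a tuple) given by one cycle."""
    perm = list(range(n))
    for a, b in zip(points, points[1:] + points[:1]):
        perm[a - 1] = b - 1
    return tuple(perm)


def orbit(point, generators):
    """All points reachable from `point` under the given generators."""
    seen, frontier = {point}, [point]
    while frontier:
        x = frontier.pop()
        for g in generators:
            if g[x] not in seen:
                seen.add(g[x])
                frontier.append(g[x])
    return seen


def order_from_sgs(sgs, n):
    """Product of orbit sizes along the stabilizer chain, as in equation (3)."""
    size = 1
    for i in range(n):
        # Generators that fix every earlier point; these generate the i-th stabilizer
        # only if `sgs` really is a strong generating set.
        stab_gens = [g for g in sgs if all(g[j] == j for j in range(i))]
        size *= len(orbit(i, stab_gens))
    return size


s4_strong = [cycle(4, 1, 2, 3, 4), cycle(4, 2, 3, 4), cycle(4, 3, 4)]
a4_strong = [cycle(4, 1, 2, 3), cycle(4, 2, 3, 4)]
s4_weak = [cycle(4, 1, 2, 3, 4), cycle(4, 3, 4)]      # generates S_4 but is not strong

print(order_from_sgs(s4_strong, 4), order_from_sgs(a4_strong, 4))
print(order_from_sgs(s4_weak, 4))    # undercounts, because the set is not strong
```

The last call illustrates why strength matters: the same group described by a generating set that is not strong yields an incorrect orbit product.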

We can also use the strong generating set to test membership in the following way.
Suppose that we have a group G described in terms of its strong generating set (and therefore
its stabilizer chain G^[i]) and a specific permutation ω. Now if ω(1) = k, there are two
possibilities:

1. If k is not in the orbit of 1 in G = G^[1], then clearly ω ∉ G.

2. If k is in the orbit of 1 in G^[1], select g_1 ∈ G^[1] with 1^(g_1) = g_1(1) = k. Now we construct
ω_1 = ω g_1^(-1), which fixes 1, and we determine recursively if ω_1 ∈ G^[2].

At the end of the process, we will have stabilized all of the elements moved by G, and
should have ω_(n+1) = 1. If so, the original ω ∈ G; if not, ω ∉ G. This procedure (which we
will revisit in Section 4) is known as sifting.

Continuing with our example, let us see if the 4-cycle ω = (1, 2, 3, 4) is in S_4 and in A_4.
For the former, we see that ω(1) = 2 and (1, 2, 3, 4) ∈ S_4^[1]. This produces ω_1 = 1, and we
can stop and conclude that ω ∈ S_4.

For the second, we know that (1, 2, 3) ∈ A_4^[1] and we get ω_1 = (1, 2, 3, 4)(1, 2, 3)^(-1) = (3, 4).
Now we could actually stop, since (3, 4) is obviously odd, but let us continue with the
procedure. Since 2 is fixed by ω_1, we have ω_2 = ω_1. Now 3 is moved to 4 by ω_2, but A_4^[3] is
the trivial group, so we conclude correctly that (1, 2, 3, 4) ∉ A_4.
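The sifting procedure just described can be sketched in Python as follows. This is an illustrative version under stated assumptions, not a production membership test: it recomputes the coset representatives at every level, takes the points in the order 1, ..., n (stored 0-indexed), and presumes that the supplied generating set is strong. The helper names are hypothetical.

```python
def cycle(n, *points):
    """Permutation of {1..n} (stored 0-indexed as a tuple) given by one cycle."""
    perm = list(range(n))
    for a, b in zip(points, points[1:] + points[:1]):
        perm[a - 1] = b - 1
    return tuple(perm)


def compose(p, q):
    """Left-to-right product: apply p first, then q."""
    return tuple(q[i] for i in p)


def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)


def transversal(base_point, generators, n):
    """Map each point y in the orbit of base_point to an element taking base_point to y."""
    reps = {base_point: tuple(range(n))}
    frontier = [base_point]
    while frontier:
        x = frontier.pop()
        for s in generators:
            y = s[x]
            if y not in reps:
                reps[y] = compose(reps[x], s)   # first reach x, then move x to y
                frontier.append(y)
    return reps


def sift(omega, sgs):
    """Membership test for omega in the group with strong generating set `sgs`."""
    n = len(omega)
    for i in range(n):
        stab_gens = [g for g in sgs if all(g[j] == j for j in range(i))]
        reps = transversal(i, stab_gens, n)
        k = omega[i]
        if k not in reps:
            return False                        # k is outside the orbit: not a member
        omega = compose(omega, inverse(reps[k]))  # the new omega fixes point i
    return omega == tuple(range(n))


s4 = [cycle(4, 1, 2, 3, 4), cycle(4, 2, 3, 4), cycle(4, 3, 4)]
a4 = [cycle(4, 1, 2, 3), cycle(4, 2, 3, 4)]
print(sift(cycle(4, 1, 2, 3, 4), s4))
print(sift(cycle(4, 1, 2, 3, 4), a4))
```

On the worked example, the first level of the A_4 test reduces the 4-cycle to the transposition (3, 4), and the test fails at the third level because the orbit of 3 in the trivial group is just {3}.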

3.2 Coset decomposition

Some of the group problems that we will be considering (e.g., the k-transporter problem)
subsume what was described previously as subsearch. Subsearch is known to be NP-complete,
so it follows that k-transporter must be as well. That suggests that the group-theoretic
methods for solving it will involve search in some way.

The search involves a potential examination of all of the instances of some augmented
clause (c, G), or, in group theoretic terms, a potential examination of each member of the
group G. The computational group theory community often approaches such a search problem
by gradually decomposing G into smaller and smaller cosets. What we will call a partition
tree is produced, where the root of the tree is the entire group G and the leaf nodes are
individual elements of G:

Definition 3.10 Let G be a group, and G^[i] a stabilizer chain for it. By the partition tree
for G we will mean a tree whose vertices at the ith level are the cosets of G^[i] and for which
the parent of a particular G^[i]g is that coset of G^[i−1] that contains it.

At any particular level i, the cosets correspond to the points to which the sequence (l_1, . . .)
can be mapped, with the points in the image of l_i identifying the children of any particular
node at level i − 1.

As an example, suppose that we consider the augmented clause

(a ∨ b, Sym(a, b, c, d)) (4)

corresponding to the collection of ground clauses

a ∨ b
a ∨ c
a ∨ d
b ∨ c
b ∨ d
c ∨ d

Suppose also that we are working with an assignment for which a and b are true and c
and d are false, and are trying to determine if any instance of (4) is unsatisfied. Assuming
that we take l_1 to be a through l_4 = d, the partition tree associated with S_4 is shown in
Figure 6.

An explanation of the notation in Figure 6 is in order. The nodes on the lefthand edge are
labeled by the associated groups; for example, the node at level 2 is labeled with Sym(b, c, d)
because this is the point at which we have fixed a but b, c and d are still allowed to vary.

As we move across the row, we find representatives of the cosets that are being considered.
So moving across the second row, the first entry (ab) means that we are taking the coset
of the basic group Sym(b, c, d) that is obtained by multiplying each element by (ab) on the
right. This is the coset that maps a uniformly to b.

On the lower rows, we multiply the coset representatives associated with the nodes leading
to the root. So the third node in the third row, labeled with (bd), corresponds to the coset
Sym(c, d) · (bd). The two elements of this coset are (bd) and (cd)(bd) = (bdc). The point b is
uniformly mapped to d, a is fixed, and c can either be fixed or mapped to b.

The fourth point on this row corresponds to the coset

Sym(c, d) · (ab) = {(ab), (cd)(ab)}

The point a is uniformly mapped to b, and b is uniformly mapped to a. The points c and d
can be swapped or not.

The fifth point is the coset

Sym(c, d) · (bc)(ab) = Sym(c, d) · (abc) = {(abc), (abcd)} (5)

Here a is still uniformly mapped to b, and b is now uniformly mapped to c. The point c can be
mapped either to a or to d.

For the fourth line, the basic group is trivial and the single member of the coset can be
obtained by multiplying the coset representatives on the path to the root. Thus the ninth
and tenth nodes (marked with asterisks in the tree) correspond to the permutations (abc)
and (abcd) respectively, and do indeed partition the coset of (5).

Understanding how this structure is used in search is straightforward. At the root, the
original augmented clause (4) may indeed have unsatisfiable instances. But when we move
to the first child, we know that the image of a is a, so that the instance of the clause in
question is a ∨ x for some x. Since a is true for the assignment in question, it follows that
the clause must be satisfied. In a similar way, mapping a to b also must produce a satisfied
clause. The search space is already reduced to the structure shown in Figure 7.

If we map a to c, then the first point on the next row corresponds to mapping b to b,
producing a satisfiable clause. If we map b to a (the next node), we also get a satisfiable
clause. If we map b to d, we will eventually get an unsatisfiable clause, although it is not
clear how to recognize that without expanding the two children. The case where a is mapped
to d is similar, and the final search tree is shown in Figure 8.

Instead of the six clauses that might need to be examined as instances of the original
(4), only four leaf nodes need to be considered. The internal nodes that were pruned above
can be pruned without generation, since the only values that need to be considered for a are
necessarily c and d (the unsatisfied literals in the theory). At some level, then, the above
search space becomes as shown in Figure 9.
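The pruned search just described can be imitated with a short depth-first walk. The Python sketch below is specific to the full symmetric group on the listed points (so that every injective choice of images is a group element) and to positive literals; the function name and encoding are assumptions made for the illustration. It fixes the image of a, then of b, and so on, and never generates a child in which a clause literal is mapped to a true literal.

```python
def search(points, clause, true_literals):
    """Depth-first walk of the partition tree of Sym(points).

    Level by level, the image of the next point is chosen. A child is skipped,
    without being generated, when it would send a literal of the clause to a
    literal that is true, since every instance below it is satisfied.
    Returns the surviving group elements (as dicts) and the number of leaves reached.
    """
    found = []

    def extend(image):
        depth = len(image)
        if depth == len(points):
            found.append(dict(zip(points, image)))
            return
        for target in points:
            if target in image:
                continue                      # images must be distinct
            if points[depth] in clause and target in true_literals:
                continue                      # pruned: the instance would be satisfied
            extend(image + (target,))

    extend(())
    return found, len(found)


elements, leaves = search("abcd", clause={"a", "b"}, true_literals={"a", "b"})
print(leaves)
print({frozenset((g["a"], g["b"])) for g in elements})
```

Of the 24 leaves of the full tree, only those under the nodes that send a and b to c and d are visited, and all of them correspond to the single unsatisfied instance c ∨ d.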

3.3 Lex leaders 

Although the remaining search space in this example already examines fewer leaf nodes than
the original, there still appears to be some redundancy. To understand one possible
simplification, recall that we are searching for a group element g for which g(c) (or, equivalently, c^g)
is unsatisfied given the current assignment. Since any such group element suffices, we can (if
we wish) search for that group element that is smallest under the lexicographic ordering of
the group itself. (Where g_1 < g_2 if g_1(a) is earlier in the alphabet than g_2(a) or g_1(a) = g_2(a)
and g_1(b) < g_2(b), and so on.) If we denote by S the set of group elements that have the
property we are searching for, the lexicographically smallest element of S is often called the
"lexicographic leader" or "lex leader" of S.

In our example, imagine that there were a solution (i.e., a group element corresponding
to an unsatisfied instance) under the right hand node at depth three. Now there would
necessarily also have been an analogous solution under the preceding node at depth three,
since the two search spaces are in some sense identical. But the group elements under the
left hand node precede those under the right hand node in the lexicographic ordering, so it
follows that the lexicographically least element (which is all that we're looking for) is not
under the right hand node, which can therefore be pruned. The search space becomes as
shown in Figure 10.

This particular technique is quite general: whenever we are searching for a group element
with a particular property, we can restrict our search to lex leaders of the set of all such
elements and prune the search space on that basis. A more complete discussion in the context
of the k-transporter problem specifically can be found in Section 5.5.

Finally, we note that the two remaining leaf nodes are equivalent, since they refer to the
same instance - once we know the images of a and of b, the overall instance is fixed and
no further choices are relevant. So assuming that the variables in the problem are ordered
so that those in the clause are considered first, we can finally prune the search below depth
three to get the structure shown in Figure 11. Only a single leaf node need be considered.

3.4 Stochastic methods 

Finally, we should at least remark on the use of stochastic methods to solve problems in
computational group theory. This is an area of considerable current research in the
computational group theory community, and some embodiments of our ideas will include the
application of stochastic techniques to solve group-theoretic problems.

Essentially, the aim is to replace a deterministic method for computing the solution
to a group theoretic problem (such as an instance of the k-transporter problem, or the
construction of a stabilizer chain) with a stochastic method that runs more quickly and has
a high probability p of returning the correct answer. These techniques are referred to as
"Monte Carlo" methods by the computational group theory community, and a variety of
such techniques are known and appear in both Seress' book and GAP. Running times for
large groups are generally better than for deterministic techniques, and the probabilities of
failure are extremely small in practice.

Recent work has focused on the lifting of Monte Carlo methods to so-called Las Vegas
methods, which have the property that in some cases, the answer returned is guaranteed
to be correct. As an example, an algorithm that returns a candidate solution to the
k-transporter problem can check that answer before returning it, so that if a group element is
returned, it is guaranteed correct; if the algorithm reports failure, the problem might in fact
be solvable after all.

The advantage of Las Vegas methods is that the probability that the answer is correct
can generally be increased by running them repeatedly with different random number seeds.
In this particular example, we can invoke the procedure repeatedly on a single instance of the
k-transporter problem, either stopping when a solution is found or increasing our confidence
that no such answer exists. This idea is related to the use of random restarts in combinatorial
search generally.
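The one-sided guarantee described above can be illustrated with a deliberately naive randomized search for the k-transporter problem. The Python sketch below is not one of the methods from Seress' book or GAP; it merely multiplies randomly chosen generators (which does not sample the group uniformly) and verifies each candidate before returning it. The function name, the parameters tries, walk and seed, and the tuple encoding of permutations are all assumptions made for the example.

```python
import random


def compose(p, q):
    """Left-to-right product: apply p first, then q."""
    return tuple(q[i] for i in p)


def las_vegas_transporter(clause, generators, S, U, k, tries=200, walk=20, seed=0):
    """Randomized k-transporter search with a verified answer.

    A returned element always satisfies the k-transporter conditions because it
    is checked before being returned. A result of None only means that no
    solution was found within the given number of tries.
    """
    rng = random.Random(seed)
    identity = tuple(range(len(generators[0])))
    for _ in range(tries):
        g = identity
        for _ in range(rng.randint(0, walk)):
            g = compose(g, rng.choice(generators))   # naive random word in the generators
        image = {g[x] for x in clause}
        if not (image & S) and len(image & U) <= k:  # verify before answering
            return g
    return None


s4 = [(1, 0, 2, 3), (1, 2, 3, 0)]
for seed in range(3):                                # repeated runs with different seeds
    print(las_vegas_transporter({0, 1}, s4, S={0, 1}, U=set(), k=1, seed=seed))
```

Repeating the call with different seeds, as in the loop above, corresponds to the repeated invocation described in the text: each failure increases confidence that no solution exists, while any success is certain.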

4 Augmented resolution 

We now turn to our ZAP-specific requirements. First, we have the definition of augmented
resolution, which involves computing the group of stable extensions of the groups appearing
in the resolvents. Specifically, we have augmented clauses (c_1, G_1) and (c_2, G_2) and need to
compute the group G of stable extensions of G_1 and G_2. Recalling Definition 2.11, this is
the group of all permutations ω with the property that there is some g_1 ∈ G_1 such that

ω|_(c_1^(G_1)) = g_1|_(c_1^(G_1))

and similarly for G_2 and c_2 (note that we have adjusted notation here, replacing the G_i(K_i)
in the original definition 2.11 with c_i^(G_i)). We are viewing the clauses as sets here, with c_i^(G_i)
as usual being the image of the set under the given permutation group.
As an example, consider the two clauses

(c_1, G_1) = (a ∨ b, ⟨(ad), (be), (bf)⟩)

and

(c_2, G_2) = (c ∨ b, ⟨(be), (bg)⟩)

The image of c_1 under G_1 is {a, b, d, e, f} and c_2^(G_2) = {b, c, e, g}. We therefore need to
find a permutation ω such that when ω is restricted to {a, b, d, e, f}, it is an element of
⟨(ad), (be), (bf)⟩, and when restricted to {b, c, e, g} is an element of ⟨(be), (bg)⟩.

From the second condition, we know that c cannot be moved by ω, and any permutation
of b, e and g is acceptable because (be) and (bg) generate the symmetric group Sym(b, e, g).
This second restriction does not impact the image of a, d or f under ω.

From the first condition, we know that a and d can be swapped or left unchanged, and
any permutation of b, e and f is acceptable. But recall from the second condition that we
must also permute b, e and g. These conditions combine to imply that we cannot move f or
g, since to move either would break the condition on the other. We can swap b and e or not,
so the group of stable extensions is ⟨(ad), (be)⟩, and that is what our construction should
return.
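The conclusion of this example can be checked by exhaustive enumeration. The Python sketch below is a brute-force verification only and is not the construction referred to in this section: it tests every permutation of the seven points a through g against the two restriction conditions. The helper names and the tuple encoding are assumptions made for the illustration.

```python
from itertools import permutations

POINTS = "abcdefg"
IDX = {p: i for i, p in enumerate(POINTS)}


def swap(x, y):
    """The transposition (x y) as a tuple over the indices of POINTS."""
    perm = list(range(len(POINTS)))
    perm[IDX[x]], perm[IDX[y]] = IDX[y], IDX[x]
    return tuple(perm)


def generate(generators):
    """Enumerate the group generated by `generators` by closure."""
    identity = tuple(range(len(POINTS)))
    seen, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = tuple(s[i] for i in g)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen


def stable_extensions(clauses_and_groups):
    """All permutations whose restriction to each clause image matches some group element."""
    constraints = []
    for clause, gens in clauses_and_groups:
        group = generate(gens)
        # Image of the clause under the whole group, as a sorted list of indices.
        support = sorted({g[IDX[x]] for g in group for x in clause})
        allowed = {tuple(g[i] for i in support) for g in group}
        constraints.append((support, allowed))
    return {w for w in permutations(range(len(POINTS)))
            if all(tuple(w[i] for i in support) in allowed
                   for support, allowed in constraints)}


result = stable_extensions([
    ("ab", [swap("a", "d"), swap("b", "e"), swap("b", "f")]),
    ("cb", [swap("b", "e"), swap("b", "g")]),
])
print(len(result))
print(result == generate([swap("a", "d"), swap("b", "e")]))
```

Because the two clause images together cover all seven points, the set found is exactly the four-element group generated by (ad) and (be).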

Procedure 4.1 Given augmented clauses (c_1, G_1) and (c_2, G_2), to compute stab(c_i, G_i):

1 c_image_1 ← c_1^(G_1), c_image_2 ← c_2^(G_2)
2 g_restrict_1 ← G_1|_(c_image_1), g_restrict_2 ← G_2|_(c_image_2)
3 C_∩ ← c_image_1 ∩ c_image_2
4 g_stab_1 ← (g_restrict_1)_{C_∩}, g_stab_2 ← (g_restrict_2)_{C_∩}
5 g_int ← g_stab_1|_(C_∩) ∩ g_stab_2|_(C_∩)
6 {g_i} ← {generators of g_int}
7 {l_1i} ← {g_i, lifted to g_stab_1}, {l_2i} ← {g_i, lifted to g_stab_2}
8 {l'_2i} ← {l_2i|_(c_image_2 − C_∩)}
9 return ⟨(g_restrict_1)_C_∩, (g_restrict_2)_C_∩, {l_1i · l'_2i}⟩

Proposition 4.2 77&e result returned by Procedure 4.1 is stab(cj,Gt). 

Let us present an example of the computation in use. We will then present the proof and
discuss the computational issues surrounding Procedure 4.1. The example we will use is that
with which we began this section, but we modify $G_1$ to be $\langle (ad), (be), (bf), (xy) \rangle$ instead of
the earlier $\langle (ad), (be), (bf) \rangle$. The new points x and y don't affect the set of instances in any
way, and thus should not affect the resolution computation, either.

1. $\mathrm{c\_image}_i \leftarrow c_i^{G_i}$. This amounts to computing the images of the $c_i$ under the $G_i$; as
   described earlier, we have $\mathrm{c\_image}_1 = \{a, b, d, e, f\}$ and $\mathrm{c\_image}_2 = \{b, c, e, g\}$.

2. $\mathrm{g\_restrict}_i \leftarrow G_i|_{\mathrm{c\_image}_i}$. Here, we restrict each group to act only on the corresponding
   $\mathrm{c\_image}_i$. In this example, $\mathrm{g\_restrict}_2 = G_2$ but $\mathrm{g\_restrict}_1 = \langle (ad), (be), (bf) \rangle$
   as the irrelevant points x and y are removed.

   Note that it is not always possible to restrict a group to an arbitrary set; one cannot
   restrict the permutation (xy) to the set {x} because you need to add y as well. But in
   this case, it is possible to restrict $G_i$ to $\mathrm{c\_image}_i$ since this latter set is closed under
   the action of the group.

3. $C_\cap \leftarrow \mathrm{c\_image}_1 \cap \mathrm{c\_image}_2$. The construction itself works by considering three separate
   sets: the intersection of the images of the two original clauses (where the computation
   is interesting because the various $\omega$ must agree), and the points in only the image of
   $c_1$ or only the image of $c_2$. The analysis on these latter sets is straightforward; we just
   need $\omega$ to agree with any element of $G_1$ or $G_2$ on the set in question.

   In this step, we compute the intersection region $C_\cap$. In our example, $C_\cap = \{b, e\}$.

4. $\mathrm{g\_stab}_i \leftarrow (\mathrm{g\_restrict}_i)_{\{C_\cap\}}$. We find the subgroup of $\mathrm{g\_restrict}_i$ that set stabilizes
   $C_\cap$, in this case the subgroup that set stabilizes the pair $\{b, e\}$. For $\mathrm{g\_restrict}_1 =
   \langle (ad), (be), (bf) \rangle$, this is $\langle (ad), (be) \rangle$ because we can no longer swap b and f, while for
   $\mathrm{g\_restrict}_2 = \langle (be), (bg) \rangle$, we get $\mathrm{g\_stab}_2 = \langle (be) \rangle$.

5. $\mathrm{g\_int} \leftarrow \mathrm{g\_stab}_1|_{C_\cap} \cap \mathrm{g\_stab}_2|_{C_\cap}$. Since $\omega$ must simultaneously agree with both $G_1$ and
   $G_2$ when restricted to $C_\cap$ (and thus with $\mathrm{g\_restrict}_1$ and $\mathrm{g\_restrict}_2$ as well), the
   restriction of $\omega$ to $C_\cap$ must lie within this intersection. In our example, $\mathrm{g\_int} = \langle (be) \rangle$.

6. $\{g_i\} \leftarrow \{\text{generators of } \mathrm{g\_int}\}$. Any element of $\mathrm{g\_int}$ will lead to an element of the
   group of stable extensions provided that we extend it appropriately from $C_\cap$ back to
   the full set $c_1^{G_1} \cup c_2^{G_2}$; this step begins the process of building up these extensions. It
   suffices to work with just the generators of $\mathrm{g\_int}$, and we construct those generators
   here. We have $\{g_i\} = \{(be)\}$.

7. $\{l_{ki}\} \leftarrow \{g_i \text{, lifted to } \mathrm{g\_stab}_k\}$. Our goal is now to build up a permutation on
   $\mathrm{c\_image}_1 \cup \mathrm{c\_image}_2$ that, when restricted to $C_\cap$, matches the generator $g_i$. We do this
   by lifting $g_i$ separately to $\mathrm{c\_image}_1$ and to $\mathrm{c\_image}_2$. Any such lifting suffices, so we
   can take (for example)

   $$l_{11} = (be)(ad)$$

   and

   $$l_{21} = (be)$$

   In the first case, the inclusion of the swap of a and d is neither precluded nor required;
   we could just as well have used $l_{11} = (be)$.

8. $\{l'_{2i}\} \leftarrow \{l_{2i}|_{\mathrm{c\_image}_2 - C_\cap}\}$. We cannot simply compose $l_{1i}$ and $l_{2i}$ to get the desired
   permutation on $\mathrm{c\_image}_1 \cup \mathrm{c\_image}_2$ because the part of the permutations acting on
   the intersection $\mathrm{c\_image}_1 \cap \mathrm{c\_image}_2$ will have acted twice. In this case, we would get
   $l_{11} \cdot l_{21} = (ad)$ which no longer captures our freedom to exchange b and e.

   We deal with this by restricting $l_{2i}$ away from $C_\cap$ and only then combining with $l_{1i}$. In
   the example, restricting (be) away from $C_\cap = \{b, e\}$ produces the trivial permutation

   $$l'_{21} = (\,)$$

9. Return $\langle (\mathrm{g\_restrict}_1)_{C_\cap}, (\mathrm{g\_restrict}_2)_{C_\cap}, \{l_{1i} \cdot l'_{2i}\} \rangle$. We now compute the final answer
   from three sources: the combined $l_{1i} \cdot l'_{2i}$ that we have been working to construct, along
   with elements of $\mathrm{g\_restrict}_1$ that fix every point in the image of $c_2$ and elements of
   $\mathrm{g\_restrict}_2$ that fix every point in the image of $c_1$. These latter two sets consist of
   stable extensions. An element of $\mathrm{g\_restrict}_1$ pointwise stabilizes the image of $c_2$ if
   and only if it pointwise stabilizes the points that are in both the image of $c_1$ (to which
   $\mathrm{g\_restrict}_1$ has been restricted) and the image of $c_2$; in other words, if and only if it
   pointwise stabilizes $C_\cap$.

   In our example, we have

   $$(\mathrm{g\_restrict}_1)_{C_\cap} = \langle (ad) \rangle$$
   $$(\mathrm{g\_restrict}_2)_{C_\cap} = 1$$
   $$\{l_{1i} \cdot l'_{2i}\} = \{(be)(ad)\}$$

   so that the final group returned is

   $$\langle (ad), (be)(ad) \rangle$$

   This group is identical to

   $$\langle (ad), (be) \rangle$$

   We can swap either the (a, d) pair or the (b, e) pair, as we see fit. The first swap (ad)
   is sanctioned for the first "resolvent" $(c_1, G_1) = (a \vee b, \langle (ad), (be), (bf) \rangle)$ and does not
   mention any relevant variable in the second $(c_2, G_2) = (c \vee b, \langle (be), (bg) \rangle)$. The second
   swap (be) is sanctioned in both cases.
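
Steps 7 through 9 of the example can be replayed mechanically. The Python sketch below is an illustration only; it hard-codes the lifts chosen in the text, and the helpers `perm`, `compose`, `generate` and `restrict_away` are hypothetical names introduced here. It shows that composing the two lifts directly loses the (be) swap, while restricting the second lift away from the intersection first gives the expected generator.

```python
POINTS = "abcdefg"
IDX = {p: i for i, p in enumerate(POINTS)}

def perm(*cycles):
    """Permutation of POINTS given by disjoint cycles, stored as a tuple of images."""
    img = {p: p for p in POINTS}
    for cyc in cycles:
        for i, p in enumerate(cyc):
            img[p] = cyc[(i + 1) % len(cyc)]
    return tuple(img[p] for p in POINTS)

def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[IDX[x]] for x in p)

def generate(gens):
    """All elements of the group generated by gens (naive closure, small groups only)."""
    identity = tuple(POINTS)
    group, frontier = {identity}, [identity]
    while frontier:
        elt = frontier.pop()
        for g in gens:
            new = compose(elt, g)
            if new not in group:
                group.add(new)
                frontier.append(new)
    return group

def restrict_away(p, subset):
    """Make p act trivially on subset; only meaningful if p maps subset onto itself."""
    return tuple(x if x in subset else p[IDX[x]] for x in POINTS)

C_cap = {"b", "e"}
l11 = perm("be", "ad")                   # lift of (be) to g_stab_1
l21 = perm("be")                         # lift of (be) to g_stab_2
naive = compose(l11, l21)                # (be) acts twice and cancels
l21_prime = restrict_away(l21, C_cap)    # step 8
combined = compose(l11, l21_prime)       # step 9 generator
final_group = generate([perm("ad"), combined])
```

Here `naive` equals (ad) while `combined` equals (be)(ad), matching the discussion in steps 8 and 9.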




Computational issues We conclude this section by discussing some of the computational 
issues that arise when we implement Procedure 4.1, including the complexity of the various 
operations required. 

1. $\mathrm{c\_image}_i \leftarrow c_i^{G_i}$. Efficient algorithms exist for computing the image of a set under a
   group. The basic method is to use a flood-fill like approach, adding and marking the
   result of acting on the set with a single element, and recurring until no new points are
   added.

2. $\mathrm{g\_restrict}_i \leftarrow G_i|_{\mathrm{c\_image}_i}$. A group can be restricted to a set that it stabilizes by
   restricting the generating permutations individually.

3. $C_\cap \leftarrow \mathrm{c\_image}_1 \cap \mathrm{c\_image}_2$. Set intersection is straightforward.

4. $\mathrm{g\_stab}_i \leftarrow (\mathrm{g\_restrict}_i)_{\{C_\cap\}}$. Set stabilizer is not straightforward, and is not known to
   be polynomial in the number of generators of the group being considered. The most
   effective implementations work with a coset decomposition as described in Section 3.2;
   in computing $G_{\{S\}}$ for some set S, a node can be pruned when it maps a point inside
   of S out of S or vice versa. GAP implements this.

5. $\mathrm{g\_int} \leftarrow \mathrm{g\_stab}_1|_{C_\cap} \cap \mathrm{g\_stab}_2|_{C_\cap}$. Group intersection is also not known to be
   polynomial in the number of generators; once again, a coset decomposition is used. Coset
   decompositions are constructed for each of the groups being combined, and the search
   spaces are pruned accordingly. GAP implements this as well.

6. $\{g_i\} \leftarrow \{\text{generators of } \mathrm{g\_int}\}$. Groups are typically represented in terms of their
   generators, so reconstructing a list of those generators is trivial. Even if the generators
   are not known, constructing a strong generating set is known to be polynomial in the
   number of generators constructed.

7. $\{l_{ki}\} \leftarrow \{g_i \text{, lifted to } \mathrm{g\_stab}_k\}$. Suppose that we have a group G acting on a set T,
   a subset $V \subseteq T$ and a permutation h acting on V such that we know that h is the
   restriction to V of some $g \in G$, so that $h = g|_V$. To find such a g, we first construct a
   stabilizer chain for G where the ordering used puts the elements of $T - V$ first. Now
   we are basically looking for a $g \in G$ such that the sifting procedure of Section 3.1
   produces h at the point that the points in $T - V$ have all been fixed. We can find such
   a g in polynomial time by inverting the sifting procedure itself.

8. $\{l'_{2i}\} \leftarrow \{l_{2i}|_{\mathrm{c\_image}_2 - C_\cap}\}$. As in line 2, restriction is still easy.

9. Return $\langle (\mathrm{g\_restrict}_1)_{C_\cap}, (\mathrm{g\_restrict}_2)_{C_\cap}, \{l_{1i} \cdot l'_{2i}\} \rangle$. Since groups are typically
   represented by their generators, we need simply take the union of the generators for the three
   arguments. Pointwise stabilizers (needed for the first two arguments) are straightforward
   to compute using stabilizer chains.
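
The first two items are simple enough to sketch directly. The following Python fragment is an illustration under stated assumptions, not the implementation referred to in the text: permutations are stored as dictionaries listing only the points they move, and the names `cycle`, `image_of_set` and `restrict_generators` are hypothetical.

```python
def cycle(points):
    """A single cycle as a dict mapping each moved point to its image."""
    return {points[i]: points[(i + 1) % len(points)] for i in range(len(points))}

def image_of_set(points, generators):
    """Flood fill: keep applying generators until no new point is reached."""
    seen = set(points)
    stack = list(points)
    while stack:
        x = stack.pop()
        for g in generators:
            y = g.get(x, x)          # points not listed in g are fixed
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen

def restrict_generators(generators, subset):
    """Restrict each generator to subset, which must be closed under it."""
    restricted = []
    for g in generators:
        r = {x: g.get(x, x) for x in subset}
        if set(r.values()) != set(subset):
            raise ValueError("subset is not closed under this generator")
        if any(r[x] != x for x in r):   # drop generators that become trivial
            restricted.append(r)
    return restricted

# G_1 from the worked example, including the irrelevant swap (xy)
G1_gens = [cycle("ad"), cycle("be"), cycle("bf"), cycle("xy")]
c_image_1 = image_of_set({"a", "b"}, G1_gens)
g_restrict_1 = restrict_generators(G1_gens, c_image_1)
```

In this example the generator (xy) acts trivially on the image and is dropped, which mirrors step 2 of the worked example.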

5 Unit propagation and the (ir)relevance test

As we have remarked, the other main computational requirement of an augmented satisfiability
engine is the ability to solve the k-transporter problem: Given an augmented clause
(c, G) where c is once again viewed as a set of literals, and sets S and U of literals and an
integer k, we want to find a $g \in G$ such that $c^g \cap S = \emptyset$ and $|c^g \cap U| \le k$, if such a g
exists. (Once again, we have changed to the notation where group elements are written as
exponents of the objects on which they are acting.)

5.1 A warmup

We begin with a somewhat simpler problem, assuming that $U = \emptyset$ so we are simply looking
for a g such that $c^g \cap S = \emptyset$.
We need the following definition:

Definition 5.1 Let $H \le G$ be groups. By a transversal of H in G we will mean any subset
of G that contains one element of each coset of H. We will denote such a transversal by
$G/H$.

Note that since H itself is one of the cosets, the transversal must contain a (unique) element
of H. We will generally assume that the identity is this unique element.

Procedure 5.2 Given groups $H \le G$, an element $t \in G$, sets c and S, to find a group
element g = map(G, H, t, c, S) with $g \in H$ and $c^{gt} \cap S = \emptyset$:

1  $F \leftarrow \{\alpha \in c \text{ such that } \alpha \text{ is fixed by } H\}$
2  if $F^t \cap S \ne \emptyset$
3    then return FAILURE
4  if $c \subseteq F$
5    then return 1
6  $\alpha \leftarrow$ an element of $c - F$
7  for each $t'$ in $H/H_\alpha$
8    do $r \leftarrow \mathrm{map}(G, H_\alpha, t't, c, S)$
9      if $r \ne$ FAILURE
10       then return $rt'$
11 return FAILURE

This is essentially a codification of the example that was presented in Section 3.2. We 
terminate the search when the clause is fixed by the remaining group H, but have not 
yet included any analog to the lex-leader pruning that we discussed in Section 3.3. In the 
recursive call in line 8, we retain the original group, for which we will have use in subsequent 
versions of the procedure. 

Proposition 5.3 map(G, G, 1, c, S) returns an element $g \in G$ for which $c^g \cap S = \emptyset$, if such
an element exists, and returns FAILURE otherwise.

Given that the procedure terminates the search when all elements of c are stabilized by
G but does not include lex-leader considerations, the search space examined in the example
from Section 3.2 is as shown by the structure in Figure 12. It is still important to prune
the node in the lower right, since for a larger problem, this node may be expanded into a
significant search subtree. We discuss this pruning in Section 5.5.

In the interests of clarity, let us go through the example explicitly. Recall that the clause
$c = x_1 \vee x_2$, $G = \mathrm{Sym}(x_1, x_2, x_3, x_4)$ permutes the $x_i$ arbitrarily, and that $S = \{x_1, x_2\}$.

On the initial pass through the procedure, $F = \emptyset$; suppose that we select $x_1$ to stabilize
first. Step 7 now selects the point to which $x_1$ should be mapped; if we select $x_1$ or $x_2$, then
$x_1$ itself will be mapped into S and the recursive call will fail on line 3. So suppose we pick
$x_3$ as the image of $x_1$.

Now $F = \{x_1\}$, and we need to fix the image of another point; $x_2$ is all that's left in
the original clause c. As before, selecting $x_1$ or $x_2$ as the image of $x_2$ leads to failure. $x_3$ is
already taken (it's the image of $x_1$), so we have to map $x_2$ into $x_4$. Now every element of c is
fixed, and the next recursive call returns the trivial permutation on line 5. This is combined
with $(x_2 x_4)$ on line 10 in the caller as we fix $x_4$ as the image of $x_2$. The original invocation
then combines with $(x_1 x_3)$ to produce the final answer of $(x_1 x_3)(x_2 x_4)$.
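
For small groups, Procedure 5.2 can be sketched with groups stored as explicit sets of permutations. This Python version is an illustration only: a practical implementation would use stabilizer chains rather than enumerating H, the unused argument G is omitted, and the names `compose` and `find_map` are hypothetical. Points are numbered 0 to 3 for $x_1$ to $x_4$, and a transversal of $H_\alpha$ in H is obtained by choosing one element of H for each possible image of $\alpha$.

```python
from itertools import permutations

def compose(p, q):
    """Apply p first, then q (permutations as tuples of images)."""
    return tuple(q[i] for i in p)

def find_map(H, t, c, S):
    """Return g in H with c^(g t) disjoint from S, or None for FAILURE."""
    F = {a for a in c if all(h[a] == a for h in H)}      # line 1
    if any(t[a] in S for a in F):                        # lines 2-3
        return None
    if c <= F:                                           # lines 4-5
        return tuple(range(len(t)))
    alpha = min(c - F)                                   # line 6
    H_alpha = {h for h in H if h[alpha] == alpha}
    seen = set()
    for h in sorted(H):                                  # line 7: one h per coset
        if h[alpha] in seen:
            continue
        seen.add(h[alpha])
        r = find_map(H_alpha, compose(h, t), c, S)       # line 8
        if r is not None:                                # lines 9-10
            return compose(r, h)
    return None                                          # line 11

G = set(permutations(range(4)))      # Sym(x1, x2, x3, x4)
identity = tuple(range(4))
c, S = {0, 1}, {0, 1}                # c = x1 v x2, S = {x1, x2}
g = find_map(G, identity, c, S)
image_of_c = {g[a] for a in c}
```

Any permutation returned maps c onto the two points outside S; which one is found depends on the order in which coset representatives are tried.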

5.2 The k-transporter problem

Extending the above algorithm to solve the k-transporter problem is straightforward; in
addition to requiring that $F^t \cap S = \emptyset$ in lines 2-3, we also need to keep track of the number of
points that have been mapped into the set U and make sure that we haven't exceeded the
limit k:

Procedure 5.4 Given groups $H \le G$, an element $t \in G$, sets c, S and U and an integer
k, to find a group element g = transport(G, H, t, c, S, U, k) with $g \in H$, $c^{gt} \cap S = \emptyset$ and
$|c^{gt} \cap U| \le k$:

1  $F \leftarrow \{\alpha \in c \text{ such that } \alpha \text{ is fixed by } H\}$
2  if $F^t \cap S \ne \emptyset$
3    then return FAILURE
4  if $|F^t \cap U| > k$
5    then return FAILURE
6  if $c \subseteq F$
7    then return 1
8  $\alpha \leftarrow$ an element of $c - F$
9  for each $t'$ in $H/H_\alpha$
10   do $r \leftarrow \mathrm{transport}(G, H_\alpha, t't, c, S, U, k)$
11     if $r \ne$ FAILURE
12       then return $rt'$
13 return FAILURE

Proposition 5.5 transport(G, G, 1, c, S, U, k) returns an element $g \in G$ for which $c^g \cap S =
\emptyset$ and $|c^g \cap U| \le k$, if such an element exists, and returns FAILURE otherwise.

The procedure is simplified significantly by the fact that we only need to return a single
g with the desired properties, as opposed to all of them. But it might seem that it would
be more efficient, when looking for unit instances of (c, G), to return all such instances as
opposed to only one.

This is not the case. If a single unit instance is found, setting the unvalued literal it
contains may well cause (c, G) to acquire new unit instances, and the search will therefore
need to be repeated in any event.

In what follows, we will investigate a variety of techniques that generalize the prune at
lines 4-5 of Procedure 5.4. It will therefore be convenient to rewrite the procedure as:

Procedure 5.6 Given groups $H \le G$, an element $t \in G$, sets c, S and U and an integer
k, to find a group element g = transport(G, H, t, c, S, U, k) with $g \in H$, $c^{gt} \cap S = \emptyset$ and
$|c^{gt} \cap U| \le k$:

1  $F \leftarrow \{\alpha \in c \text{ such that } \alpha \text{ is fixed by } H\}$
2  if $F^t \cap S \ne \emptyset$
3    then return FAILURE
4  if $\mathrm{overlap}(H, c, (S \cup U)^{t^{-1}}) > k$
5    then return FAILURE
6  if $c \subseteq F$
7    then return 1
8  $\alpha \leftarrow$ an element of $c - F$
9  for each $t'$ in $H/H_\alpha$
10   do $r \leftarrow \mathrm{transport}(G, H_\alpha, t't, c, S, U, k)$
11     if $r \ne$ FAILURE
12       then return $rt'$
13 return FAILURE

In line 4, we use an auxiliary function that determines the minimum overlap between $c^g$
and $(S \cup U)^{t^{-1}}$. Our initial version of overlap is:

Procedure 5.7 Given a group H, and two sets c, V, to compute overlap(H, c, V), a lower
bound on the overlap of $c^h$ and V for any $h \in H$:

1  $F \leftarrow \{\alpha \in c \text{ such that } \alpha \text{ is fixed by } H\}$
2  return $|F \cap V|$

Proposition 5.8 transport(G, G, 1, c, S, U, k) as computed by Procedure 5.6 returns an
element $g \in G$ for which $c^g \cap S = \emptyset$ and $|c^g \cap U| \le k$, if such an element exists, and returns
FAILURE otherwise.
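
Procedures 5.6 and 5.7 can be sketched together in the same brute-force style. The Python below is an illustration that assumes groups are small enough to store as explicit sets of permutations; the names `compose`, `inverse`, `fixed_points`, `overlap` and `transport` (without the unused argument G) are hypothetical, and points 0 to 3 are arbitrary labels.

```python
from itertools import permutations

def compose(p, q):
    """Apply p first, then q (permutations as tuples of images)."""
    return tuple(q[i] for i in p)

def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def fixed_points(H, c):
    """Points of c fixed by every element of H."""
    return {a for a in c if all(h[a] == a for h in H)}

def overlap(H, c, V):
    """Procedure 5.7: points of c already fixed by H that lie in V."""
    return len(fixed_points(H, c) & V)

def transport(H, t, c, S, U, k):
    """Return g in H with c^(g t) avoiding S and meeting U in at most k points."""
    F = fixed_points(H, c)
    if any(t[a] in S for a in F):                        # lines 2-3
        return None
    t_inv = inverse(t)
    V = {t_inv[x] for x in S | U}                        # (S u U)^(t^-1)
    if overlap(H, c, V) > k:                             # lines 4-5
        return None
    if c <= F:                                           # lines 6-7
        return tuple(range(len(t)))
    alpha = min(c - F)                                   # line 8
    H_alpha = {h for h in H if h[alpha] == alpha}
    seen = set()
    for h in sorted(H):                                  # line 9: one h per coset
        if h[alpha] in seen:
            continue
        seen.add(h[alpha])
        r = transport(H_alpha, compose(h, t), c, S, U, k)
        if r is not None:
            return compose(r, h)                         # line 12
    return None

G = set(permutations(range(4)))
identity = tuple(range(4))
c, S, U = {0, 1}, {0}, {1, 2}
g = transport(G, identity, c, S, U, 1)
none_found = transport(G, identity, c, S, U, 0)
```

With k = 1 an instance exists (one literal lands on point 3, the other in U); with k = 0 both literals would have to land on point 3, so the search fails.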
5.3 Orbit pruning

There are two general ways in which nodes can be pruned in the k-transporter problem.
Lexicographic pruning is a bit more difficult, so we defer it until Section 5.5. To understand
the other, consider the following example.

Suppose that $c = x_1 \vee x_2 \vee x_3$ and that the group G permutes the variables $\{x_1, x_2, x_3, x_4, x_5, x_6\}$
arbitrarily. If $S = \{x_1, x_2, x_3, x_4\}$, is there a $g \in G$ with $c^g \cap S = \emptyset$?

Clearly not; there isn't enough "room" because the image of c will be of size three, and
there is no way that this 3-element set can avoid the 4-element set S in the 6-element universe
$\{x_1, x_2, x_3, x_4, x_5, x_6\}$.

We can do a bit better in many cases. Suppose that our group G is $\langle (x_1 x_4), (x_2 x_5), (x_3 x_6) \rangle$
so that we can swap $x_1$ with $x_4$ (or not), $x_2$ with $x_5$, or $x_3$ with $x_6$. Now if $S = \{x_1, x_4\}$, can
we find a $g \in G$ with $c^g \cap S = \emptyset$?

Once again, the answer is clearly no. The "orbit" of $x_1$ in G is $\{x_1, x_4\}$ and since
$\{x_1, x_4\} \subseteq S$, $x_1$'s image cannot avoid the set S.

To formalize this, we first define an orbit:

Definition 5.9 Let G be a group acting on a set T. Then an orbit of G is a minimal subset
V of T that is closed under G, so that $V^G = V$. For a specific element $x \in T$, the orbit of x
is the orbit of G that contains x.

Consider now the initial call, where t = 1, the identity permutation. Given the group G,
consider the orbits of the points in c. If there is any such orbit W for which $|W \cap c| > |W - S|$,
we can prune the search. The reason is that each of the points in $W \cap c$ must remain in W
when acted on by any element of G; that is what the definition of an orbit requires. But
there are too many points in $W \cap c$ to stay away from S, so we will not manage to have
$c^g \cap S = \emptyset$.

What about the more general case, where $t \ne 1$ necessarily? For a fixed $\alpha$ in our clause c,
we will construct the image $\alpha^{gt}$, acting on $\alpha$ first with g and then with t. We are interested
in whether $\alpha^{gt} \in S$ or, equivalently, if $\alpha^g \in S^{t^{-1}}$. Now $\alpha^g$ is necessarily in the same orbit as
$\alpha$, so we can prune if

$$|W \cap c| > |W - S^{t^{-1}}|$$

For similar reasons, we can also prune if

$$|W \cap c| > |W - U^{t^{-1}}| + k$$

In fact, we can prune if

$$|W \cap c| > |W - (S \cup U)^{t^{-1}}| + k$$

because there still is not enough space to fit the image without either intersecting S or
putting more than k points into U.

We can do better still. As we have seen, for any particular orbit, the number of points
that will eventually be mapped into U is at least

$$|W \cap c| - |W - (S \cup U)^{t^{-1}}|$$

In some cases, this expression will be negative; the number of points that will be mapped
into U is at least

$$\max(|W \cap c| - |W - (S \cup U)^{t^{-1}}|, 0)$$

and we can prune any node for which

$$\sum_W \max(|W \cap c| - |W - (S \cup U)^{t^{-1}}|, 0) > k \qquad (6)$$

where the sum is over the orbits of the group.

It will be somewhat more convenient to rewrite this using the fact that

$$|W \cap c| + |W - c| = |W| = |W \cap (S \cup U)^{t^{-1}}| + |W - (S \cup U)^{t^{-1}}|$$

so that (6) becomes

$$\sum_W \max(|W \cap (S \cup U)^{t^{-1}}| - |W - c|, 0) > k \qquad (7)$$

Incorporating this into Procedure 5.7 gives:

Procedure 5.10 Given a group H, and two sets c, V, to compute overlap(H, c, V), a lower
bound on the overlap of $c^h$ and V for any $h \in H$:

1  $m \leftarrow 0$
2  for each orbit W of H
3    do $m \leftarrow m + \max(|W \cap V| - |W - c|, 0)$
4  return m

Proposition 5.11 Let H be a group and c, V sets acted on by H. Then for any $h \in H$,
$|c^h \cap V| \ge \mathrm{overlap}(H, c, V)$ where overlap is computed by Procedure 5.10.
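
Procedure 5.10 needs only the orbits of H, which can be found from the generators. The Python sketch below is illustrative; the names `perm`, `orbits` and `overlap_orbits` are hypothetical, permutations are dictionaries listing only moved points, and the integers 1 to 6 stand for $x_1$ to $x_6$. It is run on the two examples that open this section, with U empty and k = 0.

```python
def perm(*cycles):
    """Permutation as a dict of moved points, built from disjoint cycles."""
    img = {}
    for cyc in cycles:
        for i, p in enumerate(cyc):
            img[p] = cyc[(i + 1) % len(cyc)]
    return img

def orbits(points, generators):
    """Partition points into orbits by flood fill over the generators."""
    remaining = set(points)
    result = []
    while remaining:
        seed = remaining.pop()
        orbit, stack = {seed}, [seed]
        while stack:
            x = stack.pop()
            for g in generators:
                y = g.get(x, x)
                if y not in orbit:
                    orbit.add(y)
                    stack.append(y)
        remaining -= orbit
        result.append(orbit)
    return result

def overlap_orbits(points, generators, c, V):
    """Procedure 5.10: sum over orbits W of max(|W & V| - |W - c|, 0)."""
    return sum(max(len(W & V) - len(W - c), 0)
               for W in orbits(points, generators))

points = range(1, 7)
c = {1, 2, 3}

# G permutes x1..x6 arbitrarily; S = {x1, x2, x3, x4}
sym6 = [perm((1, 2)), perm((1, 2, 3, 4, 5, 6))]
bound_sym = overlap_orbits(points, sym6, c, {1, 2, 3, 4})

# G = <(x1 x4), (x2 x5), (x3 x6)>; S = {x1, x4}
swaps = [perm((1, 4)), perm((2, 5)), perm((3, 6))]
bound_swaps = overlap_orbits(points, swaps, c, {1, 4})
```

Both bounds come out as 1, which exceeds k = 0, so both searches are pruned at the root.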

5.4 Block pruning 

The pruning described in the previous section can be improved further. To see why, consider 
the following example, which might arise in solving an instance of the pigeonhole problem. 
We have the two cardinality constraints: 

$$x_1 + x_2 + x_3 + x_4 \ge 2 \qquad (8)$$

$$x_5 + x_6 + x_7 + x_8 \ge 2 \qquad (9)$$

presumably saying that at least two of four pigeons are not in hole m and at least two are
not in hole n for some m and n. (In an actual pigeonhole instance, all of the variables would
be negated. We have dropped the negations for convenience.) Rewriting the individual
cardinality constraints as augmented clauses produces

$$(x_1 \vee x_2 \vee x_3, \mathrm{Sym}(x_1, x_2, x_3, x_4))$$
$$(x_5 \vee x_6 \vee x_7, \mathrm{Sym}(x_5, x_6, x_7, x_8))$$

or, in terms of generators,

$$(x_1 \vee x_2 \vee x_3, \langle (x_1 x_2), (x_2 x_3 x_4) \rangle) \qquad (10)$$
$$(x_5 \vee x_6 \vee x_7, \langle (x_5 x_6), (x_6 x_7 x_8) \rangle) \qquad (11)$$



What we would really like to do, however, is to capture the full symmetry in a single axiom.
We can do this by realizing that we can obtain (11) from (10) by switching $x_1$ and $x_5$,
$x_2$ and $x_6$, and $x_3$ and $x_7$ (in which case we want to switch $x_4$ and $x_8$ as well). So we add
the generator $(x_1 x_5)(x_2 x_6)(x_3 x_7)(x_4 x_8)$ to the overall group, and modify the permutations
$(x_1 x_2)$ and $(x_2 x_3 x_4)$ (which generate $\mathrm{Sym}(x_1, x_2, x_3, x_4)$) so that they permute $x_5, x_6, x_7, x_8$
appropriately as well. The single augmented clause that we obtain is

$$(x_1 \vee x_2 \vee x_3, \langle (x_1 x_2)(x_5 x_6), (x_2 x_3 x_4)(x_6 x_7 x_8), (x_1 x_5)(x_2 x_6)(x_3 x_7)(x_4 x_8) \rangle) \qquad (12)$$

and it is not hard to see that this does indeed capture both (10) and (11).

Now suppose that $x_1$ and $x_5$ are false, and the other variables are unvalued. Does (12)
have a unit instance?

With regard to the pruning condition in the previous section, the group has a single orbit,
and the condition (with t = 1) is

$$|W \cap (S \cup U)| - |W - c| > 1 \qquad (13)$$

But

$$W = \{x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8\}$$
$$S = \emptyset$$
$$U = \{x_2, x_3, x_4, x_6, x_7, x_8\}$$
$$c = \{x_1, x_2, x_3\}$$

so that $|W \cap (S \cup U)| = 6$, $|W - c| = 5$ and (13) fails.

But it should be possible to conclude immediately that there are no unit instances of (12).
After all, there are no unit instances of (8) or (9) because only one variable in each clause
has been set, and three unvalued variables remain. Equivalently, there is no unit instance of
(10) because only one of $\{x_1, x_2, x_3, x_4\}$ has been valued, and two need to be valued to make
$x_1 \vee x_2 \vee x_3$ or another instance unit. Similarly, there is no unit instance of (11). What went
wrong?

What went wrong is that the pruning heuristic thinks that both $x_1$ and $x_5$ can be mapped
to the same clause instance, in which case it is indeed possible that the instance in question
be unit. The heuristic doesn't realize that $x_1$ and $x_5$ are in separate "blocks" under the
action of the group in question.

To formalize this, let us first make the following definition:

Definition 5.12 Let T be a set, and G a group acting on it. We will say that G acts
transitively on T if T is an orbit of G.

Put somewhat differently, G acts transitively on T just in case for any $x, y \in T$ there is
some $g \in G$ such that $x^g = y$.

Definition 5.13 Suppose that a group G acts transitively on a set T. Then a block system
for G is a partitioning of T into sets $B_1, \ldots, B_n$ such that G permutes the $B_i$.

In other words, for each $g \in G$ and each block $B_i$, $B_i^g = B_j$ for some j; the image of $B_i$
under g is either identical to it (if j = i) or disjoint (if $j \ne i$, since the blocks partition T).
Every group acting transitively and nontrivially on a set T has at least two block systems:

Definition 5.14 For a group G acting transitively on a set T, a block system $B_1, \ldots, B_n$
will be called trivial if either n = 1 or n = |T|.

In the former case, there is a single block consisting of the entire set T (which obviously
is a block system). If n = |T|, each point is in its own block; since G permutes the points,
it obviously permutes the blocks.

Lemma 5.15 All of the blocks in a block system are of identical size. 

In the example we have been considering, $B_1 = \{x_1, x_2, x_3, x_4\}$ and $B_2 = \{x_5, x_6, x_7, x_8\}$
is also a block system for the action of the group on the set $T = \{x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8\}$.
And while it is conceivable that a clause is unit within the overall set T, it is impossible for
it to have fewer than two unvalued literals within each particular block. Instead of looking
at the overall expression

$$|W \cap (S \cup U)| - |W - c| > 1 \qquad (14)$$

we can work with individual blocks.

The clause $x_1 \vee x_2 \vee x_3$ is in a single block in this block system, and will therefore remain
in a single block after being acted on with any $g \in G$. If the clause winds up in block $B_i$,
then the condition (14) can be replaced with

$$|B_i \cap (S \cup U)| - |B_i - c| > 1$$

or, in this case,

$$|B_i \cap (S \cup U)| > |B_i - c| + 1 = 2$$

so that we can prune if there are more than two unvalued literals in the block in question.
After all, if there are three or more unvalued literals, there must be at least two in the clause
instance being considered, and it cannot be unit.




Of course, we don't know exactly which block will eventually contain the image of c, but
we can still prune if

$$\min_i(|B_i \cap (S \cup U)|) > 2$$

since in this case any target block will generate a prune. And in the example that we have
been considering,

$$|B_i \cap (S \cup U)| = 3$$

for each block in the block system.

Generalizing this idea is straightforward. For notational convenience, we introduce:

Definition 5.16 Let $T = \{T_i\}$ be sets, and suppose that $T_{i_1}, \ldots, T_{i_n}$ are the n elements of
T of smallest size. Then we will denote $\sum_{j=1}^{n} |T_{i_j}|$ by $\min_{+n}\{T_i\}$.

Proposition 5.17 Let G be a group acting transitively on a set T, and let $c, V \subseteq T$. Suppose
also that $\{B_i\}$ is a block system for G and that $c \cap B_i \ne \emptyset$ for n of the blocks in $\{B_i\}$. Then
if b is the size of an individual block $B_i$ and $g \in G$,

$$|c^g \cap V| \ge |c| + \min_{+n}\{B_i \cap V\} - nb \qquad (15)$$

Proposition 5.18 If the block system is trivial (in either sense), (15) is equivalent to

$$|c^g \cap V| \ge |T \cap V| - |T - c| \qquad (16)$$

Proposition 5.19 Let $\{B_i\}$ be a block system for a group G acting transitively on a set T.
Then (15) is never weaker than (16).

In any event, we have shown that we can strengthen Procedure 5.10 to:

Procedure 5.20 Given a group H, and two sets c, V, to compute overlap(H, c, V), a lower
bound on the overlap of $c^h$ and V for any $h \in H$:

1  $m \leftarrow 0$
2  for each orbit W of H
3    do $\{B_i\} \leftarrow$ a block system for W under H
4      $n = |\{i \mid B_i \cap c \ne \emptyset\}|$
5      $m \leftarrow m + \max(|c \cap W| + \min_{+n}\{B_i \cap V\} - n|B_1|, 0)$
6  return m

In practice, the best choice of a block system for line 3 of the procedure appears to be
a minimal block system (i.e., one with blocks of the smallest size) for which c is contained
within a single block. Now Procedure 5.20 becomes:

Procedure 5.21 Given a group H, and two sets c, V, to compute overlap(H, c, V), a lower
bound on the overlap of $c^h$ and V for any $h \in H$:

1  $m \leftarrow 0$
2  for each orbit W of H
3    do $\{B_i\} \leftarrow$ a minimal block system for W under H for which $c \cap W \subseteq B_i$ for some i
4      $m \leftarrow m + \max(|c \cap W| + \min_i |B_i \cap V| - |B_1|, 0)$
5  return m

Proposition 5.22 Let H be a group and c, V sets acted on by H. Then for any $h \in H$,
$|c^h \cap V| \ge \mathrm{overlap}(H, c, V)$ where overlap is computed by Procedure 5.21.
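
The block-based bound can be sketched as follows, assuming the block systems have already been computed and are passed in, one per orbit. This Python fragment is an illustration of Procedure 5.20; the names `min_plus` and `overlap_blocks` are hypothetical, and the per-orbit term uses the part of c lying in that orbit. The integers 1 to 8 stand for $x_1$ to $x_8$ in the example of clause (12).

```python
def min_plus(sizes, n):
    """min_{+n}: sum of the n smallest values in sizes."""
    return sum(sorted(sizes)[:n])

def overlap_blocks(block_systems, c, V):
    """Lower bound on |c^h & V|, given one block system (list of sets) per orbit."""
    m = 0
    for blocks in block_systems:
        W = set().union(*blocks)                    # the orbit itself
        n = sum(1 for B in blocks if B & c)         # blocks that c touches
        b = len(blocks[0])                          # common block size
        sizes = [len(B & V) for B in blocks]
        m += max(len(c & W) + min_plus(sizes, n) - n * b, 0)
    return m

c = {1, 2, 3}                      # x1 v x2 v x3
V = {2, 3, 4, 6, 7, 8}             # S u U with x1 and x5 false, S empty

two_blocks = [[{1, 2, 3, 4}, {5, 6, 7, 8}]]
one_block = [[set(range(1, 9))]]
singletons = [[{i} for i in range(1, 9)]]

bound_blocks = overlap_blocks(two_blocks, c, V)
bound_trivial = overlap_blocks(one_block, c, V)
bound_points = overlap_blocks(singletons, c, V)
```

With k = 1, only the nontrivial block system gives a bound (2) large enough to prune; both trivial block systems reproduce the orbit bound of 1, in line with Proposition 5.18.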

Note that the block system being used depends only on the group H and the original
clause c. This means that in an implementation it is possible to compute these block systems
once and then use them even if there are changes in the sets S and U of satisfied and unvalued
literals respectively.

GAP includes algorithms for finding minimal block systems for which a given set of
elements (called a "seed" in GAP) is contained within a single block. The basic idea is to
form an initial block "system" where the points in the seed are in one block and each point
outside of the seed is in a block of its own. The algorithm then repeatedly runs through the
generators of the group, seeing if any generator g maps elements x, y in one block to $x^g$ and
$y^g$ that are in different blocks. If this happens, the blocks containing $x^g$ and $y^g$ are merged.
This continues until every generator respects the candidate block system, at which point the
procedure is complete.
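
The merging idea just described can be sketched with a union-find structure. This Python version is an illustration of the idea only and makes no claim about how GAP implements it; the names `perm` and `minimal_block_system` are hypothetical, and the group is assumed to act transitively on the given points.

```python
def perm(*cycles):
    """Permutation as a dict of moved points, built from disjoint cycles."""
    img = {}
    for cyc in cycles:
        for i, p in enumerate(cyc):
            img[p] = cyc[(i + 1) % len(cyc)]
    return img

def minimal_block_system(points, generators, seed):
    """Smallest partition with seed inside one block that every generator respects."""
    points = list(points)
    parent = {p: p for p in points}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx == ry:
            return False
        parent[ry] = rx
        return True

    seed = list(seed)
    for s in seed[1:]:
        union(seed[0], s)

    changed = True
    while changed:                          # repeat until every generator is respected
        changed = False
        for g in generators:
            blocks = {}
            for p in points:
                blocks.setdefault(find(p), []).append(p)
            for members in blocks.values():
                first = g.get(members[0], members[0])
                for p in members[1:]:
                    # images of points in one block must share a block
                    if union(first, g.get(p, p)):
                        changed = True

    blocks = {}
    for p in points:
        blocks.setdefault(find(p), set()).add(p)
    return sorted(blocks.values(), key=min)

# Generators of the group in clause (12), with i standing for x_i
gens = [perm((1, 2), (5, 6)),
        perm((2, 3, 4), (6, 7, 8)),
        perm((1, 5), (2, 6), (3, 7), (4, 8))]
blocks = minimal_block_system(range(1, 9), gens, {1, 2, 3})
```

Seeding with the clause $\{x_1, x_2, x_3\}$ yields the two blocks used in the example; a different seed, such as $\{x_1, x_5\}$, yields a different block system.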

5.5 Lexicographic pruning 

Block pruning, however, will not help us with the example at the end of Section 5.1. The
final space being searched is illustrated by Figure 13. As we have remarked, the first fringe
node (where a is mapped to c and b to d) is essentially identical to the second (where a
is mapped to d and b to c). It is important not to expand both since more complicated
examples may involve a substantial amount of search below the nodes that are fringe nodes
in the above figure.

This is the sort of situation in which lexicographic pruning can generally be applied. We
want to identify the two fringe nodes as equivalent in some way, and then expand only the
lexicographically least member of each equivalence class. For any particular node n, we need
a computationally effective way of determining if n is the lexicographically least member of
its equivalence class.

We begin by identifying conditions under which two nodes are equivalent. To lift the
problem to a more general (and more easily described) setting, suppose that we have a
group G acting on a set T. We have a Boolean function $b : \mathcal{P}(T) \to \{\text{true}, \text{false}\}$ that
takes a subset $V \subseteq T$ (an element of $\mathcal{P}(T)$, the power set of T) and returns true if V meets
some (otherwise unspecified) condition and false if V does not meet the condition. Given
a fixed subset $c \subseteq T$, we want to use coset decomposition to find a $g \in G$ such that $b(c^g)$ is
true, if such a $g \in G$ exists. As the search proceeds, under what conditions are two nodes n
and n' equivalent?

We assume first that the nodes are both at the same depth d in the search tree, and that
the points that have been stabilized in getting to this depth are $\alpha = \{\alpha_1, \ldots, \alpha_d\}$. For each
node, the residual group to be expanded is $G_\alpha$, but there are different coset representatives
t and t' leading from the root of the tree to n and n' respectively.

The fringe nodes under n correspond to group elements gt for the various $g \in G_\alpha$, and
those under n' correspond to gt'. In order for the two nodes to be equivalent, we need for
the set of gt to be equivalent to the set of gt' when acting on the set c. In other words, for
any $g \in G_\alpha$, we need there to exist a $g' \in G_\alpha$ such that $c^{gt} = c^{g't'}$.

While this does indeed define an equivalence relation, it is not one that we are able to
compute effectively. So instead of requiring that the fringe nodes under n' be a permutation
of those under n, we make the more restrictive requirement that they match exactly, so that

$$c^{gt} = c^{gt'} \qquad (17)$$

for any $g \in G_\alpha$.

In fact, we will be more restrictive still. Let $\beta$ denote the set of elements of T that are
fixed by $G_\alpha$. (Obviously $\alpha \subseteq \beta$, and the inclusion may be proper if fixing some $\alpha_i$ also
fixes other elements of the set T.) Instead of requiring that $c^{gt} = c^{gt'}$ for the clause in its
entirety, we require that the portions of c that are fixed or not fixed by $G_\alpha$ match up. In
other words, we require that

$$(c \cap \beta)^{gt} = (c \cap \beta)^{gt'} \qquad (18)$$

and

$$(c - \beta)^{gt} = (c - \beta)^{gt'} \qquad (19)$$

These two conditions obviously suffice to ensure that n and n' are equivalent.

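Consider (18) first. Since $G_\alpha$ point stabilizes $\beta$, it follows that $(c \cap \beta)^g = c \cap \beta$ and the
condition becomes

$$(c \cap \beta)^t = (c \cap \beta)^{t'}$$

We rewrite this as

$$(c \cap \beta)^{tt'^{-1}} = c \cap \beta$$

or, equivalently,

$$tt'^{-1} \in G_{\{c \cap \beta\}} \qquad (20)$$

saying that $tt'^{-1}$ is in the set stabilizer of $c \cap \beta$. Note that (20) simply says that t and t' are
in the same coset of $G_{\{c \cap \beta\}}$.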

The second condition will require a bit more work. We begin by rewriting (19) as

$$(c - \beta)^{gtt'^{-1}} = (c - \beta)^g \qquad (21)$$

This is an instance of the following:

Definition 5.23 Let G be a group acting on a set T, and suppose that $H \le G$ and $V \subseteq T$.
We will say that $g \in G$ is (V, H)-transparent if for any $h \in H$, $V^{hg} = V^h$. If V and H
are obvious from context, we will simply say that g is transparent. The set of all (V, H)-
transparent elements of G will be denoted $G_{(V,H)}$.

Lemma 5.24 $G_{(V,H)} \le G$.

In other words, the set of transparent elements is a subgroup of G.

Our requirement (21) is now clearly equivalent to the requirement that $tt'^{-1}$ be $(c - \beta, G_\alpha)$-
transparent. But $G_\alpha = G_\beta$, since they are both the point stabilizers of $\beta$. Putting together
all of the pieces, we have shown:

Proposition 5.25 Let G be a group acting on a set T, and suppose that b is a Boolean
function on the subsets of T. Fix $c \subseteq T$ and $\beta \subseteq T$. For $t, t' \in G$, suppose that

$$tt'^{-1} \in G_{\{c \cap \beta\}} \cap G_{(c - \beta, G_\beta)} \qquad (22)$$

Then there will be a $g \in G_\beta t$ for which $b(c^g)$ is true if and only if there is a $g' \in G_\beta t'$ for
which $b(c^{g'})$ is true.

In other words, the search nodes corresponding to t and to t' are equivalent.

Note that the intersection on the righthand side of (22) is the intersection of two subgroups
of G, and is itself therefore a subgroup of G. If we denote this subgroup by Z, the equivalence
condition of the proposition is simply that t and t' be in the same coset of Z.

In order to use this idea in practice, we need to be able to do the following:

1. Compute $G_{\{c \cap \beta\}}$,

2. Compute $G_{(c - \beta, G_\beta)}$,

3. Compute the intersection $G_{\{c \cap \beta\}} \cap G_{(c - \beta, G_\beta)}$, and

4. Determine if a specific t will be pruned by some other t'.

The first of these tasks is a set stabilizer calculation and the third task is a group
intersection. Both of these are described above. The second and fourth tasks are described
below.

To compute the transparent elements, we have the following: 

Procedure 5.26 Given a finite group $G$ acting on a set $T$, a subgroup $H = \langle h_i \rangle$ of $G$ and
a subset $V \subseteq T$, to compute $G_{(V,H)}$:

1 $P \leftarrow \{V, T - V\}$, a partition of $T$ of size 2
2 while there are $P_i, P_j \in P$ and an $h_k$ with $\emptyset \subset P_i^{h_k} \cap P_j \subset P_j$ and both inclusions proper
3   do $P = P - \{P_j\} \cup \{P_j \cap P_i^{h_k}, P_j - P_i^{h_k}\}$
4 return $\bigcap_{P_i \in P} G_{\{P_i\}}$

Proposition 5.27 Procedure 5.26 returns $G_{(V,H)}$.

Before moving on, there are two things to note about Procedure 5.26. As we have
remarked, we are essentially computing the set of points that simultaneously set stabilize $V^h$
for every $h \in H$; the problem is that we cannot intersect all of these set stabilizers directly
because $H$ may be large. But the first three steps of Procedure 5.26 are of polynomial
complexity. At each iteration, the partition $P$ is refined, so that the number of iterations is
bounded by the size of the set $T$ that is being partitioned.
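
The refinement loop in lines 1 to 3 of Procedure 5.26 can be sketched in Python as follows. This is only an illustrative sketch: the name `refine_partition` and the representation of each generator as a dict from points to images are assumptions made for the example and are not taken from ZAP itself.

```python
def refine_partition(points, generators, V):
    """Refine {V, T - V} until every generator maps blocks onto unions of blocks.

    points     -- iterable of the points of T (assumed hashable)
    generators -- list of dicts, each mapping every point of T to its image
    V          -- the subset of T whose images are to be stabilized
    """
    T = frozenset(points)
    V = frozenset(V)
    # Line 1: start from the two-block partition (an empty block is dropped).
    partition = {block for block in (V, T - V) if block}
    changed = True
    while changed:
        # Line 2: look for a block Pj that the image of some Pi cuts properly.
        changed = False
        for h in generators:
            for Pi in list(partition):
                image = frozenset(h[x] for x in Pi)
                for Pj in list(partition):
                    cut = image & Pj
                    if cut and cut != Pj:
                        # Line 3: replace Pj by its two pieces and restart.
                        partition.remove(Pj)
                        partition.update((cut, Pj - cut))
                        changed = True
                        break
                if changed:
                    break
            if changed:
                break
    return partition
```

Each split adds one block, so the outer loop runs fewer than |T| times, which is the bound mentioned above. For the 3-cycle (0 1 2) acting on five points with V = {0}, the sketch returns the blocks {0}, {1}, {2} and {3, 4}.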

Second, the intersection in the final step of the procedure should not be computed by
computing all of the individual set stabilizers and then computing their intersection. Instead,
recall that the set stabilizers themselves are computed using coset decomposition; if any
stabilized point is moved either into or out of the set in question, the given node can be
pruned. It is possible to modify the algorithm so that if any stabilized point is moved into or
out of any of the sets being simultaneously stabilized, the node in question is pruned. This in
fact makes line 4 faster than any of the individual set stabilizers, since the pruning condition
is more general. Alternatively, we can compute and return the equivalent
$[\prod_{P_i \in P} \mathrm{Sym}(P_i)] \cap G$, which can be constructed with existing machinery. The groups are equal but the previous
method can be expected to be faster.

This brings us to the last computational requirement in lexicographic pruning. Given a
specific $t \in G$, how can we tell if there is another $t'$ that will prune it?

We can prune $t$ if there is an equivalent $t'$ labeling a different node with $t' < t$
lexicographically. Note that we need to be careful that $t'$ label a different node, lest we prune $t$
by virtue of a node in $t$'s coset itself. Equivalently, if $t_0$ is the least element of $t$'s subtree
$G_\alpha t$, we can prune $t$ if there is any $t' \in Zt$ with $t' < t_0$.

As we will see, computing $t_0$ is relatively straightforward, but testing if $t_0$ is minimal in
its coset $Zt_0$ is not (again, as we will see shortly). It is easier to see if $t_0$ is minimal in its
left coset $t_0 Z$.

This turns out to be good enough. The point of the minimality test is to get one element
of each coset; if $z \in Zt$, then $z^{-1}$ is in the left coset $t^{-1}Z$ of $t^{-1}$. So we can select a unique
element by ensuring that $t_0^{-1}$ is minimal in its left coset in $Z$. Of course, we still have to
be careful not to prune a node based on another node in the same subtree; we now have to
select $t_0$ to be that element of $t$'s subtree for which $t_0^{-1}$ is minimal. Given that $t$'s subtree
consists of all points of the form $G_\alpha t$, we want the $t_0$ for which $t_0^{-1}$ is the minimal element
of the left coset $t^{-1}G_\alpha$.

Definition 5.28 Let $H \le G$ be groups, and $g \in G$. We will denote by $\min(gH)$ the
lexicographically least element of the coset $gH$ and by $\mathrm{is\_min}(g, H)$ the fact that $g$ is the
lexicographically least element of $gH$.

We have shown that we can add lexicographic pruning to our $k$-transporter procedure as
follows:

Procedure 5.29 Given groups $H \le G$, an element $t \in G$, sets $c$, $S$ and $U$ and an integer
$k$, to find a group element $g = \mathrm{transport}(G, H, t, c, S, U, k)$ with $g \in H$, $c^{gt} \cap S = \emptyset$ and
$|c^{gt} \cap U| \le k$:

1 $F \leftarrow \{a \in c \text{ such that } a \text{ is fixed by } H\}$
2 if $F^t \cap S \ne \emptyset$
3   then return failure
4 if $\mathrm{overlap}(H, c, (S \cup U)^{t^{-1}}) > k$
5   then return failure
6 if $c \subseteq F$
7   then return 1
8 if $\mathrm{is\_min}(\min(t^{-1}H), G_{\{c \cap F\}} \cap G_{(c - F, G_F)})$ is false
9   then return failure
10 $\alpha \leftarrow$ an element of $c - F$
11 for each $t'$ in $H/H_\alpha$
12   do $r \leftarrow \mathrm{transport}(G, H_\alpha, t't, c, S, U, k)$
13     if $r \ne$ failure
14       then return $rt'$
15 return failure

It remains for us to present algorithms for computing $\min(gH)$ and $\mathrm{is\_min}(g, H)$.

Procedure 5.30 Given a group $G$ and subgroup $H \le G$, along with permutations $g, p \in G$,
to compute $\min(g, H, p)$, the lexicographically least element of the form $ghp$ for $h \in H$:

1 if $H$ is trivial
2   then return $gp$
3 $T \leftarrow$ the points that are moved by either $g$ or $H$
4 sort $T = \{t_1, \ldots, t_n\}$
5 for $i = 1, \ldots, n$
6   do $a \leftarrow$ the smallest element of $t_i^{gHp}$
7     select $h$ such that $(t_i^g)^h = a^{p^{-1}}$
8     $H \leftarrow H_{t_i^g}$
9     $p \leftarrow hp$
10    if $H$ is trivial
11      then return $gp$

Proposition 5.31 Let $G$ be a group and $H \le G$ a subgroup, and $g \in G$. Then the value
returned by Procedure 5.30 as $\min(g, H, ())$ is $\min(gH)$.
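
The descent used in Procedure 5.30 can be illustrated with a small Python sketch. Several things here are assumptions made for the example: permutations are stored as tuples of images on the points 0..n-1, products use the right action (x^(ab) = (x^a)^b), "lexicographically least" is taken to mean least as an image tuple, and the subgroup H is enumerated explicitly as a set, which is only feasible for tiny groups. The helper names `compose`, `generate` and `coset_min` are hypothetical.

```python
def compose(a, b):
    """Right-action product: x^(ab) = (x^a)^b, permutations stored as image tuples."""
    return tuple(b[a[x]] for x in range(len(a)))


def generate(gens, n):
    """Naively enumerate the group generated by gens on the points 0..n-1."""
    identity = tuple(range(n))
    elems, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for s in gens:
            y = compose(x, s)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return elems


def coset_min(g, H):
    """Least element of gH, fixing the images of the points 0, 1, 2, ... in turn."""
    H = set(H)
    p = tuple(range(len(g)))                  # accumulated right factor, initially ()
    for t in range(len(g)):
        if len(H) == 1:                       # H is trivial: nothing left to choose
            break
        pt = g[t]                             # the point t^g
        h = min(H, key=lambda x: p[x[pt]])    # h giving the smallest image of t under g h p
        p = compose(h, p)                     # p <- hp
        H = {x for x in H if x[pt] == pt}     # H <- stabilizer of t^g in H
    return compose(g, p)
```

With H generated by the transpositions (0 1) and (2 3) and g = (3, 2, 1, 0), the sketch agrees with a direct minimum taken over all four elements of gH.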

The technique for evaluating is_min is similar:

Procedure 5.32 Given a group $G$ and subgroup $H \le G$, along with a $g \in G$, to compute
$\mathrm{is\_min}(g, H)$, determining if $g$ is the lexicographically least element of its left coset $gH$:

1 if $H$ is trivial
2   then return true
3 $T \leftarrow$ the points moved by $g$ or $H$
4 sort $T = \{t_1, \ldots, t_n\}$
5 for $i = 1, \ldots, n$
6   do if there is an $h \in H$ such that $t_i^{gh} < t_i^g$
7       then return false
8     $H \leftarrow H_{t_i^g}$
9     if $H$ is trivial
10      then return true

Note that the check in line 6 is straightforward, since it involves simply computing the orbit
of $t_i^g$ under $H$.

Proposition 5.33 The value returned by Procedure 5.32 is $\mathrm{is\_min}(g, H)$.
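
A minimal Python sketch of the test in Procedure 5.32 follows, under the same illustrative assumptions as before: permutations are image tuples on 0..n-1, products use the right action, the ordering is the tuple ordering of images, and H is enumerated explicitly rather than held as a stabilizer chain. All function names are hypothetical.

```python
from itertools import permutations


def compose(a, b):
    """Right-action product: x^(ab) = (x^a)^b."""
    return tuple(b[a[x]] for x in range(len(a)))


def generate(gens, n):
    """Naively enumerate the group generated by gens on the points 0..n-1."""
    identity = tuple(range(n))
    elems, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for s in gens:
            y = compose(x, s)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return elems


def is_min(g, H):
    """True if g is the least element (as an image tuple) of its left coset gH."""
    H = set(H)
    for t in range(len(g)):
        if len(H) == 1:                  # H is trivial
            return True
        pt = g[t]                        # the point t^g
        # Line 6: does the orbit of t^g under H contain a smaller point?
        if any(h[pt] < pt for h in H):
            return False
        H = {h for h in H if h[pt] == pt}  # line 8: stabilize t^g
    return True


# One representative per left coset of H = <(0 1), (2 3)> in Sym(4).
H = generate([(1, 0, 2, 3), (0, 1, 3, 2)], 4)
minimal = [g for g in permutations(range(4)) if is_min(g, H)]
```

Since H has four elements and Sym(4) has 24, exactly six permutations pass the test, one for each left coset.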

If we were to try to apply a similar procedure to check minimality in the right coset $Hg$,
line 6 would involve examining all of the $t_i^{hg}$ instead of the $t_i^{gh}$. The second set $t_i^{gh}$ can be
easily computed, since it's just the orbit of $t_i^g$ under $H$, but there is no apparent method for
computing $t_i^{hg}$. This is why minimality in a left coset seems easier to test than minimality
in a right coset.

This is sufficient to prune the node in the lower right of Figure 13 with which we began
this section, so that the search space in our running example finally becomes as shown in
Figure 14 as desired.

It might seem that we have brought too much mathematical power to bear on the
$k$-transporter problem specifically, but we disagree. High-performance satisfiability engines,
running on difficult problems, spend in excess of 90% of their CPU time in unit propagation,
which we have seen to be an instance of the $k$-transporter problem. Effort spent on
improving the efficiency of Procedure 5.29 (and its predecessors) leads to substantial performance
improvements in the ZAP embodiment described here.

While lexicographic pruning is important, it is also expensive. This is why we defer it to
line 8 of Procedure 5.29. An earlier lexicographic prune would be independent of the $S$ and
$U$ sets, but the count-based pruning is so much faster that we defer the lexicographic check
to the extent possible. We will need to revisit this decision in the next section.

5.6 Watched literals 

There is one pruning technique that we have not yet considered, and that is the possibility
of finding an analog in our setting to Zhang and Stickel's watched literal idea.

To understand the basic idea, suppose that we are checking to see if the clause $a \vee b \vee c$
is unit in a situation where $a$ and $b$ are unvalued. It follows that the clause cannot be unit,
independent of the value assigned to $c$.

At this point, we can watch the literals $a$ and $b$; as long as they remain unvalued, the
clause cannot be unit. In practice, the data structures representing $a$ and $b$ include a pointer
to the clause in question, and the unit test need only be performed for clauses pointed to
by literals that are changing value.
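
The watched-literal idea for an ordinary clause can be sketched as follows. This is a simplified illustration and not a description of any particular solver: the function names are hypothetical, and `assignment` is assumed to map each valued literal directly to the truth value of that literal.

```python
def choose_watches(clause, assignment):
    """Pick two unvalued literals to watch, or None if fewer than two remain."""
    unvalued = [lit for lit in clause if lit not in assignment]
    return tuple(unvalued[:2]) if len(unvalued) >= 2 else None


def needs_unit_test(watched, assignment):
    """The clause must be examined only if a watched literal has been valued."""
    return any(lit in assignment for lit in watched)


def is_unit(clause, assignment):
    """Unit: no literal is satisfied and exactly one literal is unvalued."""
    if any(assignment.get(lit) is True for lit in clause):
        return False
    return sum(lit not in assignment for lit in clause) == 1


clause = ("a", "b", "c")
assignment = {}
watched = choose_watches(clause, assignment)        # a and b are watched
assignment["c"] = False                             # c is not watched
skip = not needs_unit_test(watched, assignment)     # no unit test is required
assignment["a"] = False                             # a watched literal changes
recheck = needs_unit_test(watched, assignment)      # now the clause is examined
unit_now = is_unit(clause, assignment)              # only b remains unvalued
```

Valuing the unwatched literal c triggers no work at all; only the change to the watched literal a causes the clause to be examined.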

Our situation is complicated by the fact that determining whether or not a clause is 
unit involves not a simple check, but an actual search. This introduces some ambiguity into 
Zhang and Stickel's idea: Is it sufficient to record simply the fact that a particular augmented 
clause (c, G) cannot be unit unless some literal changes value, or do we want to record the 
point in the search that is impacted as well? 

We will return to this question shortly. For the moment, however, let us formalize the 
basic idea itself. 

Definition 5.34 An instance of the $k$-transporter problem consists of a group $G$, together
with a clause $c$, sets $S$ and $U$, and a bound $k$. A solution to the instance is any $g \in G$ such
that $c^g \cap S = \emptyset$ and $|c^g \cap U| \le k$. The instance will be denoted $(G, c, S, U, k)$.

A subinstance of the $k$-transporter problem consists of groups $H \le G$, together with a
permutation $t \in G$, a clause $c$, sets $S$ and $U$, and a bound $k$. A solution to the subinstance
is any $h \in H$ such that $c^{ht} \cap S = \emptyset$ and $|c^{ht} \cap U| \le k$. The subinstance will be denoted
$(G, H, t, c, S, U, k)$.

An instance or subinstance of the k-transporter problem will be called satisfiable if and 
only if it has a solution. 
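
To make the definition concrete, here is a brute-force Python sketch that decides a tiny instance by trying every group element. It assumes the group is given as an explicit list of permutations stored as dicts, which is not how an augmented solver would represent it; the names `image` and `solve_instance` and the literal names are hypothetical.

```python
from itertools import permutations


def image(c, g):
    """Image c^g of a set of literals under a permutation stored as a dict."""
    return {g[x] for x in c}


def solve_instance(G, c, S, U, k):
    """Return some g in G with c^g disjoint from S and |c^g & U| <= k, else None."""
    for g in G:
        cg = image(c, g)
        if not (cg & S) and len(cg & U) <= k:
            return g
    return None


# Sym({x0, x1, x2}) acting on the literals, with the literal y left fixed.
xs = ["x0", "x1", "x2"]
G = [dict(zip(xs, p), y="y") for p in permutations(xs)]
clause = {"x0", "y"}
solution = solve_instance(G, clause, S={"x0"}, U={"x1", "y"}, k=1)
```

Here the only images that avoid S and overlap U in at most one element are those sending x0 to x2, so the instance is satisfiable for k = 1 and unsatisfiable for k = 0.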

We can now define a watching set as follows: 

Definition 5.35 Let $I = (G, c, S, U, k)$ be an instance of the $k$-transporter problem. A
watching set for $I$ is any set $W$ such that $(G, c, S', U', k)$ is unsatisfiable whenever
$S' \supseteq S \cap W$ and $U' \supseteq U \cap W$. Similarly, a watching set for a subinstance $(G, H, t, c, S, U, k)$ of
the $k$-transporter problem is any set $W$ such that $(G, H, t, c, S', U', k)$ is unsatisfiable whenever
$S' \supseteq S \cap W$ and $U' \supseteq U \cap W$.

Proposition 5.36 Every unsatisfiable instance or subinstance of the $k$-transporter problem
has a watching set.

Zhang and Stickel treat satisfied and unvalued literals equally, instead of drawing the 
distinction between them that we do. They do this because (1) Given that no search is 
involved in their unit test, finding a second literal to add to a watching set is inexpensive, 
and (2) Most satisfied literals will eventually become unvalued as the search proceeds; by 
treating satisfied and unvalued literals identically, no adjustments need be made when the 
search backs up and a literal becomes unvalued. 

Definition 5.37 Let $I = (G, c, S, U, k)$ be an instance of the $k$-transporter problem. A
condensed watching set for $I$ is any set $W$ such that $(G, c, S', U', k)$ is unsatisfiable whenever
$U' \supseteq U \cap W$. Similarly, a condensed watching set for a subinstance $(G, H, t, c, S, U, k)$ of
the $k$-transporter problem is any set $W$ such that $(G, H, t, c, S', U', k)$ is unsatisfiable whenever
$U' \supseteq U \cap W$.

Proposition 5.38 A set $W$ is a condensed watching set for $(G, c, S, U, k)$ if and only if
$(G, c, S', U', k)$ is unsatisfiable whenever $S' \cup U' \supseteq U \cap W$. $W$ is a condensed watching set for
$(G, H, t, c, S, U, k)$ if and only if $(G, H, t, c, S', U', k)$ is unsatisfiable whenever $S' \cup U' \supseteq U \cap W$.

In other words, the definition of condensed watching sets allows us to think of satisfied
and unvalued literals (in $S$ and $U$ respectively) as one.

To see that our notion generalizes the earlier one, consider the following results: 

Proposition 5.39 Suppose $I = (1, c, S, U, k)$ is an instance of the $k$-transporter problem.
Then $W$ is a watching set for $I$ if and only if $|W \cap c \cap S| > 0$ or $|W \cap c \cap U| > k$. $W$ is a
condensed watching set for $I$ if and only if $|W \cap c \cap (S \cup U)| > k$.

Taking $k = 1$ obviously generalizes the existing notion.
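
For the trivial group, the watching-set criterion of Proposition 5.39 can be compared against Definition 5.35 by brute force over a small universe of literals. The sketch below is only a sanity check under stated assumptions: all function names are hypothetical, the universe is a finite set supplied by the caller, and the primed sets are allowed to range over every superset within that universe.

```python
from itertools import chain, combinations


def satisfiable_trivial(c, S, U, k):
    """Instance (1, c, S, U, k): the identity is the only candidate solution."""
    return not (c & S) and len(c & U) <= k


def is_watching_set(W, c, S, U, k):
    """The watching-set criterion stated in Proposition 5.39."""
    return len(W & c & S) > 0 or len(W & c & U) > k


def supersets(base, universe):
    """All subsets of universe that contain base."""
    rest = sorted(universe - base)
    extras = chain.from_iterable(combinations(rest, r) for r in range(len(rest) + 1))
    return [base | set(extra) for extra in extras]


def is_watching_set_by_definition(W, c, S, U, k, universe):
    """Definition 5.35, checked by enumerating every admissible S' and U'."""
    return all(
        not satisfiable_trivial(c, S2, U2, k)
        for S2 in supersets(S & W, universe)
        for U2 in supersets(U & W, universe)
    )
```

For the clause {a, b, c} with a and b unvalued and k = 1, the set {a, b} is a watching set while {a} alone is not, matching the usual two-watched-literal scheme.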

Proposition 5.40 Given a cardinality constraint $c$ requiring at least $m$ of the associated
literals to be true, $W$ is a condensed watching set for $c$ if and only if it includes at least
$m + 1$ literals in $c$.

A cardinality constraint $x_1 + \cdots + x_n \ge m$ is equivalent to the augmented clause

$(x_1 \vee \cdots \vee x_{n-m+1}, \mathrm{Sym}(x_i))$

and Proposition 5.40 is now generalized by:

Proposition 5.41 Suppose $I = (\mathrm{Sym}(d), c, S, U, k)$ is an instance of the $k$-transporter
problem where $c \subseteq d$. Then $W$ is a watching set for $I$ if either $|W \cap d \cap S| > |d - c|$ or
$|W \cap d \cap (S \cup U)| > k + |d - c|$. $W$ is a condensed watching set for $I$ if and only if
$|W \cap d \cap (S \cup U)| > k + |d - c|$.

To see that this generalizes Proposition 5.40, note that for a cardinality constraint as in
Proposition 5.40, we will have $x_{n-m+2}, \ldots, x_n$ in $d - c$, so that $|d - c| = m - 1$. Taking $k = 1$
in the conclusion of Proposition 5.41 allows us to conclude that $W$ is a condensed watching
set if and only if $|W \cap d \cap (S \cup U)| > m$. In other words, $W$ is a condensed watching set if
it includes at least $m + 1$ satisfied or unvalued literals in the original cardinality constraint.
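
The equivalence between a cardinality constraint and its augmented clause can be checked by brute force for small n. The sketch below assumes that the images of the base clause under the full symmetric group are exactly the disjunctions of n - m + 1 distinct variables; the function names are hypothetical.

```python
from itertools import combinations, product


def cardinality_holds(values, m):
    """x_1 + ... + x_n >= m for a tuple of Boolean values."""
    return sum(values) >= m


def clause_family_holds(values, m):
    """Every disjunction of n - m + 1 distinct variables is satisfied."""
    n = len(values)
    return all(
        any(values[i] for i in subset)
        for subset in combinations(range(n), n - m + 1)
    )


def equivalent(n, m):
    """Compare the two formulations on all 2^n assignments."""
    return all(
        cardinality_holds(v, m) == clause_family_holds(v, m)
        for v in product([False, True], repeat=n)
    )
```

The check succeeds because fewer than m variables are true exactly when at least n - m + 1 are false, in which case some clause of that length has no true literal.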

Having considered these examples, we now return to the general construction. As we've 
remarked, the basic use for watching sets is to reduce the frequency with which clauses 
(augmented or not) must be examined to see if they have unit instances. 

To understand some of the difficulties involved, consider the augmented clause
corresponding to the quantified clause

$\forall xy.\ p(x) \wedge q(y) \rightarrow r$

If $P$ is the set of instances of $p(x)$ and $Q$ the set of instances of $q(y)$, this becomes the
augmented clause

$(\neg p(0) \vee \neg q(0) \vee r, \mathrm{Sym}(P) \times \mathrm{Sym}(Q))$ (23)

where $p(0)$ and $q(0)$ are elements of $P$ and $Q$ respectively.

Now suppose that $q(y)$ is true for all $y$, but $p(x)$ is unvalued, as is $r$. Suppose also that
we search for unit instances of (23) by first stabilizing the image of $q$ and then of $p$ ($r$ is
stabilized by the group $\mathrm{Sym}(P) \times \mathrm{Sym}(Q)$ itself). If there are four possible bindings for $y$
(which we will denote 0, 1, 2, 3) and three for $x$ (0, 1, 2), the search space is as shown in Figure
15. In the interests of conserving space, we have written $p_i$ instead of $p(i)$ and similarly for
$q_i$.

Each of the fringe nodes fails because both $r$ and the relevant instance of $p(x)$ are
unvalued. As we work to understand watched literals in this broader setting, the questions that
we need to answer are the following:

1. When a node fails because certain literals are satisfied or unvalued, what information 
should be passed back from the search? 

2. How is this information used to reduce the amount of search that must be performed 
when one of the literals in question changes value later? 

One approach would have us pass out only the literals that caused the failure, but (as
we remarked earlier), this may cause us to lose significant amounts of information regarding
portions of the search space that need not be reexamined. Perhaps it would make more sense
to pass back not only the literals in question, but a description of the node itself, presumably
by passing out the permutation $t$ connecting the given node to the root.

In this example, the responsible literals at each fringe node are as shown in Figure 16.
If we simply accumulate these literals at the root of the search tree, we conclude that the
reason for the failure is the watching set $\{p_0, p_1, p_2, r\}$ (which is indeed a watching set for
this problem instance). The difficulty is that if any of these watched literals changes value,
we potentially have to reexamine the entire search tree.

We could, on the other hand, compute not a watching set for the instance corresponding
to the entire tree, but a watching set for the subinstance corresponding to the node that
actually fails. So the information accumulated from the tree in this example would actually
be:

$\{\ (\{p_0, r\}, 1),\ (\{p_1, r\}, (p_0 p_1)),\ (\{p_2, r\}, (p_0 p_2)),$
$(\{p_0, r\}, (q_0 q_1)),\ (\{p_1, r\}, (p_0 p_1)(q_0 q_1)),\ (\{p_2, r\}, (p_0 p_2)(q_0 q_1)),$
$(\{p_1, r\}, (p_0 p_1)(q_0 q_2)),\ (\{p_2, r\}, (p_0 p_2)(q_0 q_2))\ \}$
Each entry consists of a set of variables and the permutation leading to the node that will
need to be reevaluated if one of the variables changes value. Now when some individual $p_i$
changes value, we only need to reexamine three nodes, as opposed to the entire search space.

It is our belief that both of these approaches are wrong. If the search tree for this problem
is to be reevaluated because the value of $p_1$ has changed, we should not be using a variable
ordering for which the four appearances of $p_1$ remain separate. Instead, we should first
reorder the variables chosen for stabilization, replacing the search space depicted above with
the space illustrated in Figure 17.

Now only the center node needs reexpansion, since it is only at this node that the modified
literal $p_1$ appears. The search space becomes simply the one shown in Figure 18, which is
what one would expect if $p_1$ changes value.

What we are suggesting, then, is that the answers to the questions we posed previously 
be as follows: 

1. When a node fails because certain literals are satisfied or unvalued, what information 
should be passed back from the search? All that is required is a watching set - a list of 
literals that led to the failure. 

2. How is this information used to reduce the amount of search that must be performed 
when one of the literals in question changes value later? When the search is recon- 
sidered because a watched literal has changed value, we first stabilize clause elements 
that can be mapped to the literal in question, and we prune any node that cannot be 
impacted by the changed literal. 

If we modify the procedures developed thus far to incorporate this change, the results 
are as follows: 

Procedure 5.42 Given groups $H \le G$, an element $t \in G$, sets $c$, $S$ and $U$, an integer $k$, and
optionally a watched element $w$, to find a group element $g = \mathrm{transport}(G, H, t, c, S, U, k, w)$
with $g \in H$, $c^{gt} \cap S = \emptyset$, $|c^{gt} \cap U| \le k$, and $w \in c^{gt}$ if $w$ is supplied:

1 $F \leftarrow \{a \in c \text{ such that } a \text{ is fixed by } H\}$
2 if $w$ is supplied and $w \notin F^t$ and $w^{t^{-1}} \notin c^H$
3   then return (failure, $\emptyset$)
4 if $F^t \cap S \ne \emptyset$
5   then return (failure, $s$) for any $s \in F^t \cap S$
6 $V \leftarrow \mathrm{overlap}(H, c, (S \cup U)^{t^{-1}}, k)$
7 if $V \ne \emptyset$
8   then return (failure, $V$)
9 if $c \subseteq F$
10  then return (success, 1)
11 if $\mathrm{is\_min}(\min(t^{-1}H), G_{\{c \cap F\}} \cap G_{(c - F, G_F)})$ is false
12  then return (failure, $\emptyset$)
13 $\alpha \leftarrow$ an element of $c - F$. If $w$ is supplied and $w \notin F^t$, choose $\alpha$ so that $w^{t^{-1}} \in \alpha^H$.
14 $W \leftarrow \emptyset$
15 for each $t'$ in $H/H_\alpha$
16   do $(r, w) \leftarrow \mathrm{transport}(G, H_\alpha, t't, c, S, U, k, w)$
17     if $r$ = success
18       then return (success, $wt'$)
19       else $W \leftarrow W \cup w$
20 return (failure, $W$)

The pervasive change is that transport now returns two values, a success or failure
marker and then either a watching set (if failure) or the usual group element if success.
There are the following cases:

1. If the desired watched literal $w$ cannot be in the image $c^{gt}$, we fail immediately without
watching anything. To see if $w \in c^{gt}$ for some $g$, we check to see if either $w$ is already
in the fixed portion of the image $F^t$, or if $w^{t^{-1}} \in c^g$ for some $g$. This last condition is
equivalent to the requirement that $w^{t^{-1}}$ be in $c^H$, the orbit of $c$ under $H$.

2. If the clause is mapped into the set $S$, we can prune immediately. We need to record
any disallowed element $s$ to which the clause is mapped as the watching set; as long
as $s \in S$, the node will continue to fail.

3. If the clause will overlap U by more than k, we return the reason for the eventual 
overlap, which is now computed by the overlap function. We have modified overlap 
to accept as an additional argument a limit for the allowed overlap. 

4. If the procedure succeeds because the entire clause has been mapped (line 9), we return 
the permutation as usual. 

5. If this node is pruned for lexicographic reasons (line 11), we also do not need to return 
a watching set. The node will always be pruned independent of the values of S and U, 
since the lexicographic condition does not depend on S or on U. 

6. If the recursive call succeeds (line 17), we return the permutation computed. 

7. If the recursive call fails (line 19), we combine the watching set for the node expanded 
with the watching set being accumulated from that node's siblings. 

8. If all the siblings fail, we return the accumulated watching set. 
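
The way failure reasons accumulate into a watching set can be illustrated with a flat Python sketch. This is deliberately not Procedure 5.42: it enumerates an explicitly listed group instead of searching cosets recursively, and it performs no lexicographic or overlap pruning. The function name, the literal names and the dict representation of permutations are assumptions made for the example.

```python
from itertools import permutations, product


def transport_with_watch(G, c, S, U, k):
    """Return ("success", g), or ("failure", W) where W is a watching set."""
    W = set()
    for g in G:
        cg = {g[x] for x in c}
        blocked = sorted(cg & S)
        if blocked:
            W.add(blocked[0])            # one satisfied literal explains this failure
            continue
        unvalued = sorted(cg & U)
        if len(unvalued) > k:
            W.update(unvalued[:k + 1])   # k + 1 unvalued literals explain this failure
            continue
        return ("success", g)
    return ("failure", W)


# The clause (not p0) or (not q0) or r under Sym(P) x Sym(Q), as in (23).
P = ["-p0", "-p1", "-p2"]
Q = ["-q0", "-q1", "-q2", "-q3"]
G = [
    dict(zip(P + Q + ["r"], pp + qq + ("r",)))
    for pp, qq in product(permutations(P), permutations(Q))
]
clause = {"-p0", "-q0", "r"}
# q(y) is true for all y, so the literals -q_i are false and lie in neither S nor U.
status, W = transport_with_watch(G, clause, S=set(), U=set(P) | {"r"}, k=1)
```

Every image of the clause contains one unvalued negated p literal together with the unvalued r, so the search fails and the accumulated set contains all three negated p literals and r, in line with the watching set discussed for this example.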

Another change is in line 13, where we choose the point to stabilize so that the watched
literal $w$ is in its image (unless we have already stabilized such a point). This is guaranteed
to be possible because we know from line 2 that $w^{t^{-1}} \in c^H$.

Note the difference between the lexicographic prune on line 11 and the prunes that may 
precede it; it is only the lexicographic prune that does not need to increase the size of the 
watching set. Given this, it might seem that the lexicographic test should precede the others 
in the implementation. 

This appears not to be the case. As we have remarked, the lexicographic test is the most
expensive of those presented; moving it earlier (to precede line 4, presumably) actually slows
the unit propagation procedure by approximately 14%, primarily due to the reduction in
times for certain satisfiable outliers where the lexicographic test would otherwise be called
many times. In addition, the absolute impact on the watching sets can be expected to be
quite small.

To understand why, suppose that we are executing the procedure for an instance where
it will eventually fail. Now if $n$ is a node that can be pruned either by a counting argument
(with the new contribution $W_n$ to the set of watched literals) or by a lexicographic argument
using another node $n'$, then since the node $n'$ will eventually fail, it will contribute its own
watching set $W_{n'}$ to the eventually returned value. While it is possible that $W_n \ne W_{n'}$
(different elements of $F^t \cap S$ may be selected in line 5, for example), we expect that in the
vast majority of cases we will have $W_n = W_{n'}$ and the non-lexicographic prune will have no
impact on the eventual watching set computed.

Procedure 5.43 Given a group $H$, two sets $c$, $V$ acted on by $H$, and a bound $k \ge 0$, to
compute $\mathrm{overlap}(H, c, V, k)$, a collection of elements of $V$ sufficient to guarantee that for
any $h \in H$, $|c^h \cap V| > k$, or $\emptyset$ if no such collection exists:

1 $m \leftarrow 0$
2 $W \leftarrow \emptyset$
3 for each orbit $X$ of $H$
4   do $\{B_i\} \leftarrow$ a minimal block system for $X$ under $H$ for which $c \subseteq B_i$ for some $i$
5     $\Delta = |c| + \min(B_i \cap V) - |B_1|$
6     if $\Delta > 0$
7       then $m \leftarrow m + \Delta$
8         $W \leftarrow W \cup (X \cap V)$
9         if $m > k$
10          then return $W$
11 return $\emptyset$

Here, we simply accumulate the watched literals from each orbit; if there is no
contribution to $m$, we don't need to watch anything. This procedure could (and should) be adjusted
slightly to reduce $W$ if there is some excess in that $m > k + 1$ on line 9. Of course, we can
return the given watching set as soon as the size of the overlap exceeds the allowed cutoff.

If the overlap remains sufficiently small, we return $\emptyset$ to indicate that no prune is possible.
Note that since the transport procedure guarantees that $k \ge 0$, we will never generate a
prune without adding something to the set $W$ in line 8 of the overlap procedure.
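
A simplified version of this counting argument can be sketched in Python. The sketch is not Procedure 5.43 itself: it treats each orbit as a single block instead of computing a minimal block system, and it assumes that the clause is intersected with each orbit before counting, so the bound it derives is valid but may be weaker. The function names and the dict representation of generators are hypothetical.

```python
def orbits(points, generators):
    """Orbits of the group generated by `generators` (dicts: point -> image)."""
    seen, result = set(), []
    for start in points:
        if start in seen:
            continue
        orbit, frontier = {start}, [start]
        while frontier:
            x = frontier.pop()
            for h in generators:
                y = h[x]
                if y not in orbit:
                    orbit.add(y)
                    frontier.append(y)
        seen |= orbit
        result.append(orbit)
    return result


def overlap_watch(points, generators, c, V, k):
    """Elements of V forcing |c^h & V| > k for every h, or an empty set."""
    m, W = 0, set()
    for X in orbits(points, generators):
        # Pigeonhole: the image of c & X stays inside X and keeps its size,
        # so it must meet X & V in at least this many points.
        delta = len(c & X) + len(X & V) - len(X)
        if delta > 0:
            m += delta
            W |= X & V
            if m > k:
                return W
    return set()
```

With Sym({0, 1, 2}) acting on six points, the clause {0, 1, 3} and V = {0, 1, 2}, two clause elements must always land in V, so for k = 1 the sketch returns {0, 1, 2} as the set to watch. With V = {0, 1} the forced overlap is only one and no prune is reported.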

The min, is_min and transparency procedures are unchanged from the previous versions. 

6 ZAP problem format 

Historically, Boolean satisfiability problems are typically in a format where variables corre- 
spond to positive integers, literals are nonzero integers (negative integers are negated literals), 
and clauses are terminated with zeroes. The so-called dimacs format precedes the actual 
clauses in the problem with a single line such as p cnf 220 1122 indicating that there are 
220 variables appearing in 1,122 clauses in this problem. The problems in the Velev suite, 
for example, conform to this format. 
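
As an illustration of the numerical format just described, the following Python sketch reads a problem in the DIMACS CNF format. It is a minimal reader written for this example, with a hypothetical function name; it also skips comment lines beginning with "c", which the format permits.

```python
def parse_dimacs(text):
    """Parse DIMACS CNF text into (num_vars, num_clauses, clauses)."""
    num_vars = num_clauses = None
    clauses, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("c"):
            continue                          # blank line or comment
        if line.startswith("p"):
            _, fmt, nv, nc = line.split()     # e.g. "p cnf 220 1122"
            if fmt != "cnf":
                raise ValueError("only the cnf format is handled")
            num_vars, num_clauses = int(nv), int(nc)
            continue
        for token in line.split():
            literal = int(token)
            if literal == 0:                  # a zero terminates the clause
                clauses.append(current)
                current = []
            else:
                current.append(literal)       # negative integers are negated literals
    return num_vars, num_clauses, clauses
```

For example, the text "p cnf 3 2" followed by the lines "1 -2 0" and "2 3 -1 0" yields three variables, two clauses, and the clause list [[1, -2], [2, 3, -1]].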

The numerical format described here makes it impossible to exploit any existing under- 
standing that the user might have of the problem in question; this may not be a problem
for a conventional Boolean tool (since the problem structure will have been obscured by
the Boolean encoding in any event), but was felt to be inappropriate when building an 
augmented solver. We felt that it was important for the user to be able to: 

1. Specify numerical constraints such as appear in cardinality or parity constraints, 

2. Quantify axioms over finite domains, and

3. Provide group augmentations explicitly if the above mechanisms were insufficient. 

Let us describe the provisions made in each of these areas. 

Cardinality and parity constraints The general form of a ZAP axiom is

quantifiers symbols result

where the quantifiers are described in the next section and the symbols are essentially a
sequence of literals. The result includes information about the desired "right hand side" of
the axiom, and can be any of the following:

• A simple terminator, indicating that the clause is Boolean,

• A comparison operator (>, <=, =, etc.) followed by an integer, indicating that the
clause is a cardinality constraint, or

• A modular arithmetic operator (%n=) followed by an integer m, indicating that the
sum of the values of the literals is required to be congruent to m mod n.
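
The meaning of the three kinds of result can be sketched in Python as follows. The sketch models only the semantics of each constraint on a list of literal truth values; it does not parse ZAP input, and the function names and the operator table are assumptions made for the example.

```python
import operator

# Comparison operators assumed for illustration; the ZAP syntax may differ.
COMPARISONS = {
    ">=": operator.ge,
    "<=": operator.le,
    "=": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def boolean_clause(values):
    """A Boolean clause holds if at least one literal is true."""
    return any(values)


def cardinality(values, op, bound):
    """A cardinality constraint compares the number of true literals to a bound."""
    return COMPARISONS[op](sum(values), bound)


def parity(values, modulus, residue):
    """A modular constraint: the number of true literals is congruent to residue."""
    return sum(values) % modulus == residue % modulus
```

For instance, with the literal values [True, True, False], the constraint "at least 2" holds, "at most 1" fails, and the sum is congruent to 0 mod 2.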

Quantification In order for quantification to be used successfully, ZAP needs to be able
to work with predicate symbols. Each ZAP input file therefore begins with a list of domain
specifications, giving the names and sizes of each domain used in the theory. This is followed
by predicate specifications, giving the arity of each predicate and the domain type of each
argument. After the predicates and domains have been defined, it is possible to refer to
predicate instances directly (e.g., in[1 3] indicating that the first pigeon is in the third
hole) or in an unground fashion (e.g., in[x y]).

The quantifiers are of the form

$\forall(x_1, \ldots, x_k)$

or

$\exists(x_1, \ldots, x_k)$

where each of the $x_i$ are variables that can then appear in future predicate instances. In
addition to the two classical quantifiers above, we also introduce

V^i,..., x k ) 
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where the V quantifier means that the variables can take any values that do not cause any of 
the quantified predicate's instances to become identical. As an example, the axiom saying 
that only one pigeon can be in each hole now becomes simply 

V(pi,p 2 )V/i . iin(pi, h) V ^in(p 2 , h) 

or even 

V(pi,P2, h) • " , in(pi, h) V -iin(p 2 , h) 

The introduction of the new quantifier should be understood in the light of our earlier 
discussion where we argued that in many cases, the quantification given by V is in fact 
more natural than that provided by V. The V quantification is also far easier to represent 
using augmented clauses, and avoids in many cases the need to introduce or to reason about 
equality. In any event, ZAP supports both forms of universal quantification. 
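
The difference between the two forms of universal quantification can be illustrated by grounding the pigeonhole axiom in Python. The sketch assumes the simple case in which the restricted quantifier ranges over variables of a single domain, so that "no two instances become identical" reduces to the variables taking pairwise distinct values. The function names are hypothetical.

```python
from itertools import permutations, product


def ground_all(domain, k):
    """Classical universal quantification: every k-tuple of domain values."""
    return list(product(domain, repeat=k))


def ground_distinct(domain, k):
    """Restricted quantification: only k-tuples of pairwise distinct values."""
    return list(permutations(domain, k))


def pigeonhole_exclusion(pigeons, holes):
    """Ground clauses saying that two distinct pigeons cannot share a hole.

    Each clause is a pair of negated instances (pigeon, hole).
    """
    return [
        ((p1, h), (p2, h))
        for p1, p2 in ground_distinct(pigeons, 2)
        for h in holes
    ]
```

With three pigeons and two holes the restricted form yields 12 ground clauses. The classical form would yield 18, six of which pair a pigeon with itself and would wrongly forbid that pigeon from occupying the hole at all.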

Group definition Finally, the user can specify a group directly, assigning it a symbolic 
designator that can then be used in an augmented clause. The syntax is the conventional 
one, with a group being described in terms of generators, each of which is a permutation. 
Each permutation is a list of cycles, and each cycle is a comma-separated list of literals. 
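
A permutation given as a list of cycles can be turned into a mapping on literals as in the following Python sketch. The representation (tuples of literal names, a dict for the resulting permutation) and the function names are assumptions for illustration and do not reflect the ZAP input syntax.

```python
def permutation_from_cycles(cycles):
    """Map each literal to its successor in its cycle; other literals are fixed."""
    perm = {}
    for cycle in cycles:
        for i, literal in enumerate(cycle):
            perm[literal] = cycle[(i + 1) % len(cycle)]
    return perm


def act(perm, clause):
    """Image of a clause (a set of literals) under the permutation."""
    return {perm.get(literal, literal) for literal in clause}


# One generator with the two cycles (a, b) and (c, d, e).
generator = permutation_from_cycles([("a", "b"), ("c", "d", "e")])
moved = act(generator, {"a", "c", "f"})
```

Here a is sent to b, c is sent to d, and f, which appears in no cycle, is left fixed, so the image of the clause is {b, d, f}.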
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It will be understood that a reference in the specification to "one embodiment" or 
"an embodiment" means that a particular feature, structure, or characteristic described in 
connection with the embodiment is included in at least one embodiment of the invention. 
The appearances of the phrase "in one embodiment" in various places in the specification 
are not necessarily all referring to the same embodiment. 

The above description is included to illustrate the operation of the preferred 
embodiments and is not meant to limit the scope of the invention. The scope of the 
invention is to be limited only by the following claims. From the above discussion, many 
variations will be apparent to one skilled in the relevant art that would yet be 
encompassed by the spirit and scope of the invention. 
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